postgresql 9.0 ha - rmllschedule2012.rmll.info/img/pdf/postgresql-9-0-ha.pdf · postgresql 9.0 ha...

46
Overview PostgreSQL 9.0 Clustering Backups and monitoring Automation The end PostgreSQL 9.0 HA Julien Pivotto Inuits July 9, 2012 Julien Pivotto PostgreSQL 9.0 HA

Upload: buique

Post on 24-Jan-2019

222 views

Category:

Documents


0 download

TRANSCRIPT

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

PostgreSQL 9.0 HAJulien Pivotto

Inuits

July 9, 2012

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

The missionBefore the migration

Table of content1 Overview

The missionBefore the migration

2 PostgreSQL 9.0IntroStreaming replicationMaster configurationSlave configuration

3 ClusteringSet up of corosyncOCF resourceSynthesis

4 Backups and monitoringBackups with cron jobsBackupPCNagiosMunin

5 AutomationPuppet moduleThe node file

6 The end

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

The missionBefore the migration

Who am I

• Julien Pivotto• @roidelapluie• [email protected]• Consultant at Inuits (Belgium) since May 2011• FOSS defender since 2005

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

The missionBefore the migration

A.R.S.I.A.

• Association Régionale de Santé et d’Identification Animales(Belgique)

• 30 linux servers in several locations• A lot of Open Source:

• CentOS, Samba, Open-xchange, mailscanner, Cyrus,• . . . Puppet, jenkins, foreman, OpenVPN, GLPI, rabbitmq,• . . . BackupPC, CUPS, icinga, trac, zope, plone,• . . . solr, pentaho, funambol, munin, squid, asterisk,• . . . and PostgreSQL, . . .

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

The missionBefore the migration

C.E.R.I.S.E

• A web application:• Plone (python)• 15k+ visits, 500k+ pages and 2.000.000+ hits each month• Developped by another open-source company, Affinitic

• Several databases:• PostgreSQL 9.0• Oracle database

• Several servers/services:• Reverse proxies• Application servers• PostgreSQL servers• Oracledb server• Development server• Pentaho server

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

The missionBefore the migration

PostgreSQL before the migration

• PostgreSQL 8.3.7• No native support of HA• High availability with heartbeat 2 and DRBD• Installed on the application servers• Nothing automated• Failover: Passive node is not even read only

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

The missionBefore the migration

Monitoring before the installation

• Icinga• Check of the DRBD• Simple connection check to PostgreSQL

• Graphing with Cacti:• Size of the databases• Connexions to the database• Checkpoints

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

The missionBefore the migration

Backups before the installation

• Hourly local backups• External backups (on disk and on tape)• Made with pg_dump command• External backups tool: BackupPC

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

IntroStreaming replicationMaster configurationSlave configuration

PostgreSQL 9.0

• Released in september 2010• First version with native replication• No native failover tool• We need to use PostgreSQL + Corosync

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

IntroStreaming replicationMaster configurationSlave configuration

Write-Ahead Logging

• Every change to datafile is first written into a log file• Less disk writes• More cohesion• Logs can be replayed

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

IntroStreaming replicationMaster configurationSlave configuration

What is streaming replication

• Ship and apply WAL XLOGS to standby servers• Possibility to have multiple standby servers• Standby servers can be read-only ("Hot standby")

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

IntroStreaming replicationMaster configurationSlave configuration

DisadvantagesSpecifications of streaming replication

• Asynchronous log-shipping• Delay is often very close to 0

• No native HA features in PostgreSQL• We will use corosync

• Single-threaded replication• Single-threaded replication. . .

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

IntroStreaming replicationMaster configurationSlave configuration

Master configuration

Only one specific configuration file.

Configuration related to SRwal_level = hot_standbymax_wal_senders = 2wal_keep_segments = 128

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

IntroStreaming replicationMaster configurationSlave configuration

Master configuration

• wal_level = hot_standbyAllows stanby server to be readable

• max_wal_senders = 2We allow up to 2 standby nodes

• wal_keep_segments = 128The minimum wal segments to keep

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

IntroStreaming replicationMaster configurationSlave configuration

Slave configuration

• Two configuration files• PostgreSQL.conf file• Recovery.conf file

• A trigger file to stop replication can be specified

PostgreSQL.conf - Configuration related to SRwal_level = hot_standbyhot_standby = on

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

IntroStreaming replicationMaster configurationSlave configuration

Slave configuration

recovery.confstandby_mode = ‘on’primary_conninfo = ‘host=192.168.142.31 user=replicuser’

• standby_mode = standby server• primary_conninfo = connection to the master

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

IntroStreaming replicationMaster configurationSlave configuration

Replication user

• Super user called replication is created

SQL query to create super userCREATE USER replication SUPERUSER LOGIN CONNECTION LIMIT 1 ENCRYPTED PASSWORD ‘foobar’;

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

IntroStreaming replicationMaster configurationSlave configuration

pg_hba.conf

• pg_hba.conf contains some kind of ACLs for the PostgreSQLconnections

• Need to add the nodes as trusted.

pg_hba.confhostnossl all all 192.168.142.31/32 trusthostnossl all all 192.168.142.31/32 trusthostnossl replication replicuser 192.168.142.32/32 trust

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

IntroStreaming replicationMaster configurationSlave configuration

Setting up a slave

• 3 commands required to add a slave

Adding a standby serverpsql -c "SELECT pg_start_backup(’label’, true)"rsync -a ${PGDATA}/ standby:/srv/pgsql/standby/ --exclude postmaster.pid --exclude ‘*-master’--exclude ‘*-slave’psql -c "SELECT pg_stop_backup()"

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

Set up of corosyncOCF resourceSynthesis

Corosync configuration

• Goal: switching between master/slave• One and only one node should be master• Master need to be connected to the router• Custom master/slave ocf resource

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

Set up of corosyncOCF resourceSynthesis

crm.conf

• Main configuration file of corosync is/etc/corosync/crm.conf.

• Contains all the resources/nodes/etc. . .

Defining the nodesnode dupont.demo \attributes standby="off"

node dupond.demo \attributes standby="on"

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

Set up of corosyncOCF resourceSynthesis

crm.conf

Primitivesprimitive pgsql ocf:inuits:pgsql-msprimitive virt_ip ocf:heartbeat:IPaddr2 \params nic="eth1" iflabel="0" ip="192.168.142.30" cidr_netmask="24" broadcast="192.168.142.255"

\meta target-role="Started" is-managed="true"

primitive ping ocf:pacemaker:ping \params host_list="192.168.142.1" \op monitor interval="10s" timeout="10s" \op start interval="0" timeout="45s" \op stop interval="0" timeout="50s"

• 3 primitives:• pgsql, the PostgreSQL primitive• virt_ip, the floating IP address• ping, the primitive that will check that the servers are

connected to the router

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

Set up of corosyncOCF resourceSynthesis

crm.conf

Configuring the primitivesms pgsql-ms pgsql \params pgsqlconfig="/var/lib/pgsql/data/postgresql.conf" \lsb_script="/etc/init.d/postgresql-9.0" \pgsqlrecovery="/var/lib/pgsql/data/recovery.conf" \meta clone-max="2" clone-node-max="1" master-max="1" master-node-max="1" notify="false"

clone clone-ping ping \meta globally-unique="false"

• PostgreSQL M/S configuration: the init script, theconfiguration files. . .

• Ping resource as a clone (launched on both servers)

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

Set up of corosyncOCF resourceSynthesis

crm.conf

group and colocationgroup PSQL virt_iplocation connected PSQL \rule $id="connected-rule" -inf: not_defined pingd or pingd lte 0

colocation ip_psql inf: PSQL pgsql-ms:Masterproperty $id="cib-bootstrap-options" \cluster-infrastructure="openais" \expected-quorum-votes="2" \stonith-enabled="false" \no-quorum-policy="ignore" \default-resource-stickiness="INFINITY"

rsc_defaults $id="rsc_defaults-options" \migration-threshold="INFINITY" \failure-timeout="10" \resource-stickiness="INFINITY"

• Master should always be on the same node as the floating IPaddress

• Master should be connected to the router

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

Set up of corosyncOCF resourceSynthesis

OCF resource

• Custom OCF resource to manage the master/slavePostgreSQL

• Based on an example of resource written by Andrew Beekhoffrom Clusterlabs

• The file is in /usr/lib/ocf/resource.d/inuits/pgsql-ms• Available on github!https://github.com/roidelapluie/puppet-cluster

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

Set up of corosyncOCF resourceSynthesis

OCF resource

• The script does the following:• Move the postgresql.conf-master to postgresql.conf

when a node is promoted/master• Move the postgresql.conf-slave to postgresql.conf

when a node is depromoted/slave• Ensure that recovery.conf-slave is on recovery.conf on

slave and absent on master• Starts/restarts PostgreSQL when needed.

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

Set up of corosyncOCF resourceSynthesis

Synthesis

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

Set up of corosyncOCF resourceSynthesis

Synthesis

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

Set up of corosyncOCF resourceSynthesis

Synthesis

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

Backups with cron jobsBackupPCNagiosMunin

Backups of the databases

• The databases are quite small (a few gigabytes)• Hourly on each node (one at minute 0 and one at minute 30)• Daily external backup of each node• We keep 24 hourly backups and 7 daily backups on disk• BackupPC: we keep months of backups

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

Backups with cron jobsBackupPCNagiosMunin

Hourly backup script

/usr/local/bin/backup_hourly.sh#!/bin/bashDATE=$(date +%H)BACKUP_PATH=/var/lib/backups/hourlyfor db in foobar_db foobar2_dbdo/usr/bin/pg_dump $db | gzip > $BACKUP_PATH/${db}_$DATE.pgsql.gzln -fs $BACKUP_PATH/${db}_$DATE.pgsql.gz $BACKUP_PATH/${db}_current.pgsql.gz

done

The daily script is almost the same.

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

Backups with cron jobsBackupPCNagiosMunin

BackupPC script

/usr/local/bin/backup_backuppc.sh#!/bin/bashDATE=$(date +%u)BACKUP_PATH=/var/lib/backups/backuppcfor db in cerise trackitquality trackit zodb_cerisedo/usr/bin/pg_dump -U postgres $db | gzip > $BACKUP_PATH/${db}_$DATE.pgsql.gzln -fs $BACKUP_PATH/${db}_$DATE.pgsql.gz $BACKUP_PATH/${db}_current.pgsql.gz

done

In the backupPC config, I added the following:

BackupPC config$Conf{DumpPreUserCmd} = ‘$sshPath -t -q -x -l backuppc $host /usr/local/bin/backup_backuppc.sh’;

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

Backups with cron jobsBackupPCNagiosMunin

check_postgres script

• check_postgres.pl is a nagios-compatible perl script• Available on http://www.bucardo.org/check_postgres/and on Github

• Checks:• Current connections number• Replication delay

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

Backups with cron jobsBackupPCNagiosMunin

Check hot_standby latency

• The check_postgres.pl script has a check for hot_standbydelay

• We do not know who is the master and the slave• Here is a bash script I wrote to know the M/S order

Master/slave replication check#!/bin/bash/usr/lib64/nagios/plugins/check_postgres.pl --db="$1" \--action hot_standby_delay -w 300 -c 600 --host=$(crm_resource --resource pgsql-ms --locate|awk ‘/Master/ {master=$6} / $/ {slave=$6} END {print master","slave}’)

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

Backups with cron jobsBackupPCNagiosMunin

Munin postgres scripts

• Munin is shipped with perl plugins for postgresql• We use four of them:

• postgres_size,• postgres_checkpoints,• postgres_connections_db,• postgres_cache

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

Backups with cron jobsBackupPCNagiosMunin

Munin postgres scripts

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

Puppet moduleThe node file

Puppet module

• Puppet-postgres module forked from Kris Buytaert’s githubpage

• Forked to remove all references to services• Creates the users, the super users, the databases• Parameterized classes• The cache sizes are parameterized too (useful for Vagrantboxes)

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

Puppet moduleThe node file

Sqlexec definition

sqlexec.ppdefine postgres::sqlexec($username, $database, $sql, $sqlcheck, $host=’localhost’, $password=""){exec{ "psql -h $host –username=${username} $database

-c $̈{sql}&̈& /bin/sleep 5":environment => "PGPASSWORD=${password}",path => $::path,timeout => 600,unless => "psql -h $host -U $username $database -c $sqlcheck",require => Service[’postgresql-9.0’],

}}

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

Puppet moduleThe node file

Example in the node file

Here is the result in the node file:

site.ppnode postgres-server {

include postgresinclude siteclass {

’cluster’:cluster_bind_interface => ’eth1’,cluster_mcastaddr => ’226.94.1.1’,

}postgres::hba {

$::fqdn:password => ’yes’,allowedrules => [

"host all all $::ipaddress/32 password",’hostnossl replication replicuser 192.168.142.31/24 trust’,’hostnossl replication replicuser 192.168.142.32/24 trust’,’hostnossl all all 192.168.142.0/24 password’,

]}include site::sshkeysPostgres::Cluster[$::fqdn] -> Class[’cluster’]

}

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

Puppet moduleThe node file

Example in the node file

dupont.ppnode dupont inherits postgres-server {

postgres::cluster {$::fqdn:other_node_ip => ’192.168.142.31’,listen => ’*’,max_connections => ’100’,shared_buffers => ’120MB’,work_mem => ’3MB’,effective_cache_size => ’352MB’,

}class {

’postgres::firstsync’:remotehost => ’dupont’,password => ’rmll2012’,

}Class[’postgres’] -> Postgres::Cluster[$::fqdn]

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

Puppet moduleThe node file

Example in the node file

dupont.ppexec {

’/bin/sleep 10’:refreshonly => true,subscribe => Exec[’load_crm_config’],

}exec {

’crm node online dupond.demo’:refreshonly => true,path => $::path,subscribe => Exec[’load_crm_config’],require => Exec[’/bin/sleep 10’, ’crm node online dupond.demo’],

}Class[’site’] -> Class[’cluster’] -> Class[’postgres::firstsync’]Class[’postgres::firstsync’] -> Postgres::Hba[$::fqdn]Class[’cluster’] -> Exec[’/bin/sleep 10’] -> Class[’postgres::firstsync’]

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

Puppet moduleThe node file

Example in the node file

dupont.pppostgres::createuser {

’tintin’:password => ’rmll2012’,host => ’192.168.142.30’,passwd => ’milou’,require => Exec[’/bin/sleep 10’],

}postgres::createdb {

’moulinsart’:password => ’rmll2012’,host => ’192.168.142.30’,owner => ’tintin’,require => [Exec[’/bin/sleep 10’],Postgres::Createuser[’tintin’]],

}postgres::createsuperuser {

’replicuser’:password => ’rmll2012’,host => ’192.168.142.30’,passwd => ’icanhazapassword’,require => Exec[’/bin/sleep 10’],

}}

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

Test it yourself

You will need vagrant (http://vagrantup.com).

Test-it-yourself howto

$ git clonegit://github.com/roidelapluie/postgres9-demo-vagrant.git$ cd postgres9-demo-vagrant$ git submodule update --init$ vagrant up

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

Any questions?(I stay all the week @RMLL)

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

• Presentation made with latex-beamer• PostgreSQL logo is used following the policyhttp://wiki.postgresql.org/wiki/Trademark_Policy

• Tango icons have been released into the public domain• Database icon is under CC BY-SA 3.0http://dracos.deviantart.com/#/d2y5ele

• RMLL logo is under CC BY-SA 3.0http://2012.rmll.info/kit-media

Julien Pivotto PostgreSQL 9.0 HA

OverviewPostgreSQL 9.0

ClusteringBackups and monitoring

AutomationThe end

Thank you

Julien Pivotto PostgreSQL 9.0 HA