Setup a PostgreSQL Cluster With Repmgr and Pgbouncer
Recently I had to setup a PostgreSQL cluster and one of the prerequisites was to use repmgr.
In this post, I will explain you the work I did and how to setup this kind of cluster.
For the curious, all the documentation for repmgr is available here.
The architecture
My setup is:
- node0 / PostgreSQL (192.168.1.50)
- node1 / PostgreSQL (192.168.1.51)
- node2 / PostgreSQL (192.168.1.52)
- pgbouncer / pgbouncer (192.168.1.53)
node0 to node2 are the PostgreSQL machines with one primary and two secondaries.
The pgbouncer machine will receive all the queries and send them to the actual primary.
Setup the machines
For this post, all my machines will be on Debian 9 with PostgreSQL 10.
Install the packages
-
Install the 2ndQuadrant repository (for repmgr and pgbouncer) on all the PostgreSQL machines
# apt-get install apt-transport-https # echo "deb https://dl.2ndquadrant.com/default/release/apt stretch-2ndquadrant main" > /etc/apt/sources.list.d/2ndquadrant.list # wget --quiet -O - https://dl.2ndquadrant.com/gpg-key.asc | apt-key add -
-
Install PostgreSQL and repmgr on all the PostgreSQL machines
# apt-get install postgresql-10 postgresql-10-repmgr # systemctl stop postgresql
We need to stop PostgreSQL because we have to do some configuration.
-
Install the PostgreSQL repository (for PostgreSQL 10) on all the machines (PostgreSQL and pgbouncer)
# echo "deb http://apt.postgresql.org/pub/repos/apt/ stretch-pgdg main" > /etc/apt/sources.list.d/pgdg.list # wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add - # apt-get update
-
Install pgbouncer and the pgsql client on the pgbouncer machine
# apt-get install pgbouncer postgresql-client-10
Setup SSH passwordless
Passwordless SSH connectivity between all machines is not mandatory but is required for these cases:
repmgr standby switchover
(documentation)repmgr cluster matrix
(documentation)repmgr cluster crosscheck
(documentation)
We will use it for pgbouncer too (as you can see later).
Follow these steps:
-
Create a new pair of ssh keys and save them in the temporary directory
$ % ssh-keygen Generating public/private rsa key pair. Enter file in which to save the key (/home/pea/.ssh/id_rsa): /home/pea/tmp/repmgr/id_rsa Enter passphrase (empty for no passphrase): Enter same passphrase again: Your identification has been saved in /home/pea/tmp/repmgr/id_rsa. Your public key has been saved in /home/pea/tmp/repmgr/id_rsa.pub.
-
Create an
.ssh
directory on all machines (PostgreSQL and pgbouncer) for the postgres user:# for i in node0 node1 node2 pgbouncer ; do echo ${i} ; ssh root@${i} "mkdir /var/lib/postgresql/.ssh && chown postgres:postgres /var/lib/postgresql/.ssh" ; done
-
Copy the ssh keys on all the machines (PostgreSQL and pgbouncer)
# for i in node0 node1 node2 pgbouncer ; do echo ${i} ; scp id_rsa* root@${i}:/var/lib/postgresql/.ssh/ ; done # for i in node0 node1 node2 pgbouncer ; do echo ${i} ; scp id_rsa.pub root@${i}:/var/lib/postgresql/.ssh/authorized_keys ; done
-
Generate the file
known_hosts
and copy it to all machines# ssh-keyscan node0.raveland.priv node1.raveland.priv node2.raveland.priv pgbouncer.raveland.priv > known_hosts # for i in node0 node1 node2 pgbouncer ; do echo ${i} ; scp known_hosts root@${i}:/var/lib/postgresql/.ssh/ ; done # for i in node0 node1 node2 pgbouncer ; do echo ${i} ; ssh root@${i} "chown postgres:postgres /var/lib/postgresql/.ssh/*" ; done
WARNING: for the last step, I used the fqdn of my machines. It’s better to do like this to avoid some problems.
Now we have a SSH passwordless connectivity between all machines for the postgres user.
Configuring repmgr
In order to configure repmgr, you will need to:
- know the network subnet where repmgr will be deployed
- update the PostgreSQL configuration (file postgresql.conf)
- update the PostgreSQL client authentication configuration file (file pg_hba.conf)
In order to bootstrap your cluster, you will have to choose a server that will be the primary.
In my case, the network subnet is : 192.168.1.0/24
and the primary will be node0
.
HINT: all the repmgr commands must be run as user postgres.
Changes to apply to all PostgreSQL machines
These changes must be applied on all the PostgreSQL machines (both primary and secondaries).
At first, we need to configure sudo.
-
Create the file
/etc/sudoers.d/postgres
# echo "postgres ALL = NOPASSWD: /usr/bin/pg_ctlcluster" > /etc/sudoers.d/postgres
Then we will modify the /etc/postgresql/10/main/postgresql.conf
file.
-
Change
listen_addresses
tolisten_addresses = '*'
-
Change
shared_preload_libraries
toshared_preload_libraries = 'repmgr'
-
Add at the end of the file, this line:
include 'postgresql.replication.conf'
-
Create the file
/etc/postgresql/10/main/postgresql.replication.conf
with this:max_wal_senders = 15 max_replication_slots = 15 wal_level = 'replica' hot_standby = on archive_mode = on archive_command = '/bin/true' wal_keep_segments = 500
Of course, you can adjust these values to your setup.
-
Now we need to adapt the
/etc/postgresql/10/main/pg_hba.conf
file. Add the following lineslocal replication repmgr trust host replication repmgr 127.0.0.1/32 trust host replication repmgr 192.168.1.0/24 trust local repmgr repmgr trust host repmgr repmgr 127.0.0.1/32 trust host repmgr repmgr 192.168.1.0/24 trust
On Debian, add these lines just after this one :
local all postgres peer
Of course, change the subnet
192.168.1.0/24
by yours.WARNING: you need to adjust
pg_hba.conf
to allow your applications to connect to the PostgreSQL server.For this setup, I added at the end:
host all all 192.168.1.0/24 md5
Configuring the primary PostgreSQL server
Now that you applied the changes above, follow these steps:
-
Start PostgreSQL:
# systemctl start postgresql
-
Create the repmgr database and user:
# su - postgres $ createuser -s repmgr $ createdb repmgr -O repmgr
-
Create the file
/etc/repmgr.conf
(the postgres must be able to read it)node_id=1 node_name="node0.raveland.priv" conninfo='host=node0.raveland.priv user=repmgr dbname=repmgr connect_timeout=2' data_directory='/var/lib/postgresql/10/main' use_replication_slots=yes monitoring_history=yes service_start_command = 'sudo /usr/bin/pg_ctlcluster 10 main start' service_stop_command = 'sudo /usr/bin/pg_ctlcluster 10 main stop' service_restart_command = 'sudo /usr/bin/pg_ctlcluster 10 main restart' service_reload_command = 'sudo /usr/bin/pg_ctlcluster 10 main reload' service_promote_command = 'sudo /usr/bin/pg_ctlcluster 10 main promote' promote_check_timeout = 15 failover=automatic promote_command='/usr/bin/repmgr standby promote -f /etc/repmgr.conf --log-to-file' follow_command='/usr/bin/repmgr standby follow -f /etc/repmgr.conf --log-to-file --upstream-node-id=%n' log_file='/var/log/postgresql/repmgrd.log'
In this file, only 4 parameters are mandatory:
- node_id
- node_name
- conninfo
- data_directory
You can find a documented repmgr.conf here.
WARNING 1: for the Debian users, it’s recommended to use the wrapper pg_ctlcluster to control PostgreSQL.
WARNING 2: in this configuration, i already setup some instructions for repmgrd.
-
Register the primary server:
# su - Postgres $ repmgr -f /etc/repmgr.conf primary register INFO: connecting to primary database... NOTICE: attempting to install extension "repmgr" NOTICE: "repmgr" extension successfully installed NOTICE: primary node record (id: 1) registered
Now our primary server is up and running.
### Configuring the secondaries PostgreSQL serversFollow these steps to configure each of your secondaries servers:
-
Ensure that PostgreSQL is shutdown:
systemct stop postgresql
-
Trash the data directory of PostgreSQL:
rm -fR /var/lib/postgresql/10/main/*
-
Ensure you can connect to the primary node:
psql -h node0.raveland.priv -U repmgr
-
Create the
/etc/repmgr.conf
file like as on the primary server but adapt the valuesnode_id
,node_name
andconninfo
node_id=2 node_name="node1.raveland.priv" conninfo='host=node1.raveland.priv user=repmgr dbname=repmgr connect_timeout=2' data_directory='/var/lib/postgresql/10/main' use_replication_slots=yes monitoring_history=yes service_start_command = 'sudo /usr/bin/pg_ctlcluster 10 main start' service_stop_command = 'sudo /usr/bin/pg_ctlcluster 10 main stop' service_restart_command = 'sudo /usr/bin/pg_ctlcluster 10 main restart' service_reload_command = 'sudo /usr/bin/pg_ctlcluster 10 main reload' service_promote_command = 'sudo /usr/bin/pg_ctlcluster 10 main promote' promote_check_timeout = 15 failover=automatic promote_command='/usr/bin/repmgr standby promote -f /etc/repmgr.conf --log-to-file' follow_command='/usr/bin/repmgr standby follow -f /etc/repmgr.conf --log-to-file --upstream-node-id=%n' log_file='/var/log/postgresql/repmgrd.log'
-
Clone the primary server (first dry-run)
# su - PostgreSQL $ repmgr -h node0.raveland.priv -U repmgr -d repmgr -f /etc/repmgr.conf standby clone --dry-run NOTICE: destination directory "/var/lib/postgresql/10/main" provided INFO: connecting to source node DETAIL: connection string is: host=node0.raveland.priv user=repmgr dbname=repmgr DETAIL: current installation size is 30 MB NOTICE: standby will attach to upstream node 1 HINT: consider using the -c/--fast-checkpoint option INFO: all prerequisites for "standby clone" are met
-
Clone the primary server (for real this time)
$ repmgr -h node0.raveland.priv -U repmgr -d repmgr -f /etc/repmgr.conf standby clone NOTICE: destination directory "/var/lib/postgresql/10/main" provided INFO: connecting to source node DETAIL: connection string is: host=node0.raveland.priv user=repmgr dbname=repmgr DETAIL: current installation size is 30 MB INFO: checking and correcting permissions on existing directory "/var/lib/postgresql/10/main" NOTICE: starting backup (using pg_basebackup)... HINT: this may take some time; consider using the -c/--fast-checkpoint option INFO: executing: pg_basebackup -l "repmgr base backup" -D /var/lib/postgresql/10/main -h node0.raveland.priv -p 5432 -U repmgr -X stream -S repmgr_slot_2 NOTICE: standby clone (using pg_basebackup) complete NOTICE: you can now start your PostgreSQL server HINT: for example: sudo /usr/bin/pg_ctlcluster 10 main start HINT: after starting the server, you need to register this standby with "repmgr standby register"
-
Start the PostgreSQL server
$ sudo /usr/bin/pg_ctlcluster 10 main start
-
Register the secondary
$ repmgr -f /etc/repmgr.conf standby register INFO: connecting to local node ""node1.raveland.priv"" (ID: 2) INFO: connecting to primary database WARNING: --upstream-node-id not supplied, assuming upstream node is primary (node ID 1) INFO: standby registration complete NOTICE: standby node ""node1.raveland.priv"" (id: 2) successfully registered
-
Do the same for all your secondaries
-
As an example, here is the steps for my node2:
-
# systemct stop postgresql
-
# rm -fr /var/lib/postgresql/10/main/*
-
Create
/etc/repmgr.conf
``` node_id=3 node_name="node2.raveland.priv" conninfo='host=node2.raveland.priv user=repmgr dbname=repmgr connect_timeout=2' data_directory='/var/lib/postgresql/10/main' use_replication_slots=yes monitoring_history=yes service_start_command = 'sudo /usr/bin/pg_ctlcluster 10 main start' service_stop_command = 'sudo /usr/bin/pg_ctlcluster 10 main stop' service_restart_command = 'sudo /usr/bin/pg_ctlcluster 10 main restart' service_reload_command = 'sudo /usr/bin/pg_ctlcluster 10 main reload' service_promote_command = 'sudo /usr/bin/pg_ctlcluster 10 main promote' promote_check_timeout = 15 failover=automatic promote_command='/usr/bin/repmgr standby promote -f /etc/repmgr.conf --log-to-file' follow_command='/usr/bin/repmgr standby follow -f /etc/repmgr.conf --log-to-file --upstream-node-id=%n' log_file='/var/log/postgresql/repmgrd.log' ```
-
Clone the primary server
$ repmgr -h node0.raveland.priv -U repmgr -d repmgr -f /etc/repmgr.conf standby clone NOTICE: destination directory "/var/lib/postgresql/10/main" provided INFO: connecting to source node DETAIL: connection string is: host=node0.raveland.priv user=repmgr dbname=repmgr DETAIL: current installation size is 30 MB INFO: checking and correcting permissions on existing directory "/var/lib/postgresql/10/main" NOTICE: starting backup (using pg_basebackup)... HINT: this may take some time; consider using the -c/--fast-checkpoint option INFO: executing: pg_basebackup -l "repmgr base backup" -D /var/lib/postgresql/10/main -h node0.raveland.priv -p 5432 -U repmgr -X stream -S repmgr_slot_3 NOTICE: standby clone (using pg_basebackup) complete NOTICE: you can now start your PostgreSQL server HINT: for example: sudo /usr/bin/pg_ctlcluster 10 main start HINT: after starting the server, you need to register this standby with "repmgr standby register"
-
Start PostgreSQL:
/usr/bin/pg_ctlcluster 10 main start
-
Register the secondary
$ repmgr -f /etc/repmgr.conf standby register INFO: connecting to local node ""node2.raveland.priv"" (ID: 3) INFO: connecting to primary database WARNING: --upstream-node-id not supplied, assuming upstream node is primary (node ID 1) INFO: standby registration complete NOTICE: standby node ""node2.raveland.priv"" (id: 3) successfully registered
-
-
Now all our servers are up and running and we have a running PostgreSQL cluster.
We will see how to do basic operations with repmgr.
REMINDER: all the repmgr commands must be run as postgres user.
### Cluster status$ repmgr -f /etc/repmgr.conf cluster show
ID | Name | Role | Status | Upstream | Location | Connection string
+++-++++++++++++++++++++++--++++++++++++++++++++--++++++++++++++++++++++--++++++++++-++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
1 | "node0.raveland.priv" | primary | * running | | default | host=node0.raveland.priv user=repmgr dbname=repmgr connect_timeout=2
2 | "node1.raveland.priv" | standby | running | "node0.raveland.priv" | default | host=node1.raveland.priv user=repmgr dbname=repmgr connect_timeout=2
3 | "node2.raveland.priv" | standby | running | "node0.raveland.priv" | default | host=node2.raveland.priv user=repmgr dbname=repmgr connect_timeout=2
Now we will see what to do when your secondary will stop for an unkown reason.
-
Shutdown primary (simulate failure)
```bash # systemctl stop Postgresql ```
-
Verify on secondary
```bash $ repmgr -f /etc/repmgr.conf cluster show ID | Name | Role | Status | Upstream | Location | Connection string +++-++++++++++++++++++++++--++++++++++++++++++++++++++++++++++++++++++++++++--++++++++++-++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 | "node0.raveland.priv" | primary | ? unreachable | | default | host=node0.raveland.priv user=repmgr dbname=repmgr connect_timeout=2 2 | "node1.raveland.priv" | standby | running | "node0.raveland.priv" | default | host=node1.raveland.priv user=repmgr dbname=repmgr connect_timeout=2 3 | "node2.raveland.priv" | standby | running | "node0.raveland.priv" | default | host=node2.raveland.priv user=repmgr dbname=repmgr connect_timeout=2 WARNING: following issues were detected unable to connect to node ""node0.raveland.priv"" (ID: 1) node ""node0.raveland.priv"" (ID: 1) is registered as an active primary but is unreachable ```
-
Promote node1 as primary
```bash $ repmgr -f /etc/repmgr.conf standby promote NOTICE: promoting standby to primary DETAIL: promoting server ""node1.raveland.priv"" (ID: 2) using "sudo /usr/bin/pg_ctlcluster 10 main promote" DETAIL: waiting up to 15 seconds (parameter "promote_check_timeout") for promotion to complete NOTICE: STANDBY PROMOTE successful DETAIL: server ""node1.raveland.priv"" (ID: 2) was successfully promoted to primary ```
-
Verify on node2
```bash $ repmgr -f /etc/repmgr.conf cluster show ID | Name | Role | Status | Upstream | Location | Connection string +++-++++++++++++++++++++++--++++++++++++++++++++++++++++++++-++++++++++++++++++++++--++++++++++-++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 | "node0.raveland.priv" | primary | ? unreachable | | default | host=node0.raveland.priv user=repmgr dbname=repmgr connect_timeout=2 2 | "node1.raveland.priv" | standby | ! running as primary | "node0.raveland.priv" | default | host=node1.raveland.priv user=repmgr dbname=repmgr connect_timeout=2 3 | "node2.raveland.priv" | standby | running | "node0.raveland.priv" | default | host=node2.raveland.priv user=repmgr dbname=repmgr connect_timeout=2 WARNING: following issues were detected unable to connect to node ""node0.raveland.priv"" (ID: 1) node ""node0.raveland.priv"" (ID: 1) is registered as an active primary but is unreachable node ""node1.raveland.priv"" (ID: 2) is registered as standby but running as primary ``` That's **not good** because *node2* is still following *node0*.
-
Update node2
```bash $ repmgr -f /etc/repmgr.conf standby follow NOTICE: setting node 3s primary to node 2 NOTICE: restarting server using "sudo /usr/bin/pg_ctlcluster 10 main restart" WARNING: unable to connect to old upstream node 1 to remove replication slot HINT: if reusing this node, you should manually remove any inactive replication slots NOTICE: STANDBY FOLLOW successful DETAIL: standby attached to upstream node ""node1.raveland.priv"" (node ID: 2) ```
-
Verify
```bash $ repmgr -f /etc/repmgr.conf cluster show ID | Name | Role | Status | Upstream | Location | Connection string +++-++++++++++++++++++++++--++++++++++++++++++++--++++++++++++++++++++++--++++++++++-++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 | "node0.raveland.priv" | primary | - failed | | default | host=node0.raveland.priv user=repmgr dbname=repmgr connect_timeout=2 2 | "node1.raveland.priv" | primary | * running | | default | host=node1.raveland.priv user=repmgr dbname=repmgr connect_timeout=2 3 | "node2.raveland.priv" | standby | running | "node1.raveland.priv" | default | host=node2.raveland.priv user=repmgr dbname=repmgr connect_timeout=2 WARNING: following issues were detected unable to connect to node ""node0.raveland.priv"" (ID: 1) ```
Now we are fine (the old primary is still stopped). But as you can see it’s a bit complicated..
### My old primary is back.. what will happen ??One thing to know about repmgr is that when an old primary comes back, i will stay primary !
-
Restart the old primary (on node0):
systemctl start postgresql
-
Verify the status on the old primary (node0)
$ repmgr -f /etc/repmgr.conf cluster show ID | Name | Role | Status | Upstream | Location | Connection string +++-++++++++++++++++++++++--++++++++++++++++++++++++++++++++-++++++++++++++++++++++--++++++++++-++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 | "node0.raveland.priv" | primary | * running | | default | host=node0.raveland.priv user=repmgr dbname=repmgr connect_timeout=2 2 | "node1.raveland.priv" | standby | ! running as primary | "node0.raveland.priv" | default | host=node1.raveland.priv user=repmgr dbname=repmgr connect_timeout=2 3 | "node2.raveland.priv" | standby | running | "node0.raveland.priv" | default | host=node2.raveland.priv user=repmgr dbname=repmgr connect_timeout=2 WARNING: following issues were detected node ""node1.raveland.priv"" (ID: 2) is registered as standby but running as primary
It still thinks it’s a primary server but it detects that something goes wrong.
-
Verify the status on secondaries
$ repmgr -f /etc/repmgr.conf cluster show ID | Name | Role | Status | Upstream | Location | Connection string +++-++++++++++++++++++++++--++++++++++++++++++++--++++++++++++++++++++++--++++++++++-++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 | "node0.raveland.priv" | primary | ! running | | default | host=node0.raveland.priv user=repmgr dbname=repmgr connect_timeout=2 2 | "node1.raveland.priv" | primary | * running | | default | host=node1.raveland.priv user=repmgr dbname=repmgr connect_timeout=2 3 | "node2.raveland.priv" | standby | running | "node1.raveland.priv" | default | host=node2.raveland.priv user=repmgr dbname=repmgr connect_timeout=2 WARNING: following issues were detected node ""node0.raveland.priv"" (ID: 1) is running but the repmgr node record is inactive
The secondaries see the old primary but as inactive.
Here are the steps to follow to switch your old primary as a new secondary.
-
Shutdown PostgreSQL on the old primary :
systemctl stop postgresql
-
Rejoin the node:
$ repmgr -f /etc/repmgr.conf node service --action=stop --checkpoint $ repmgr -f /etc/repmgr.conf -d 'host=node1.raveland.priv user=repmgr dbname=repmgr' node rejoin NOTICE: setting node 1's primary to node 2 NOTICE: starting server using "sudo /usr/bin/pg_ctlcluster 10 main start" NOTICE: replication slot "repmgr_slot_2" deleted on node 1 WARNING: 1 inactive replication slots detected DETAIL: inactive replication slots: repmgr_slot_3 (physical) HINT: these replication slots may need to be removed manually NOTICE: NODE REJOIN successful DETAIL: node 1 is now attached to node 2
Of course, in the
host=
parameter, you must specify the address of the new primary.
As you can see, when you loose a primary, you will have some manual operations to do.
There is an easier way to switch primary server : switchover
.
Remember, now the primary is node1 on our cluster. But we need to shutdown it for maintenance.
-
We want to promote node2 as a new primary. At first try the dry-run:
$ repmgr standby switchover -f /etc/repmgr.conf --siblings-follow --dry-run NOTICE: checking switchover on node ""node2.raveland.priv"" (ID: 3) in --dry-run mode INFO: SSH connection to host "node1.raveland.priv" succeeded INFO: able to execute "repmgr" on remote host "localhost" INFO: all sibling nodes are reachable via SSH INFO: 2 walsenders required, 15 available INFO: demotion candidate is able to make replication connection to promotion candidate INFO: 0 pending archive files INFO: replication lag on this standby is 0 seconds INFO: 2 replication slots required, 15 available NOTICE: local node ""node2.raveland.priv"" (ID: 3) would be promoted to primary; current primary ""node1.raveland.priv"" (ID: 2) would be demoted to standby INFO: following shutdown command would be run on node ""node1.raveland.priv"": "sudo /usr/bin/pg_ctlcluster 10 main stop"
Everything looks good !
-
Do the switchover:
$ repmgr standby switchover -f /etc/repmgr.conf --siblings-follow NOTICE: executing switchover on node ""node2.raveland.priv"" (ID: 3) NOTICE: local node ""node2.raveland.priv"" (ID: 3) will be promoted to primary; current primary ""node1.raveland.priv"" (ID: 2) will be demoted to standby NOTICE: stopping current primary node ""node1.raveland.priv"" (ID: 2) NOTICE: issuing CHECKPOINT DETAIL: executing server command "sudo /usr/bin/pg_ctlcluster 10 main stop" INFO: checking for primary shutdown; 1 of 60 attempts ("shutdown_check_timeout") NOTICE: current primary has been cleanly shut down at location 0/7000028 NOTICE: promoting standby to primary DETAIL: promoting server ""node2.raveland.priv"" (ID: 3) using "sudo /usr/bin/pg_ctlcluster 10 main promote" DETAIL: waiting up to 15 seconds (parameter "promote_check_timeout") for promotion to complete NOTICE: STANDBY PROMOTE successful DETAIL: server ""node2.raveland.priv"" (ID: 3) was successfully promoted to primary NOTICE: setting node 2s primary to node 3 NOTICE: starting server using "sudo /usr/bin/pg_ctlcluster 10 main start" NOTICE: replication slot "repmgr_slot_3" deleted on node 2 WARNING: 1 inactive replication slots detected DETAIL: inactive replication slots: repmgr_slot_1 (physical) HINT: these replication slots may need to be removed manually NOTICE: NODE REJOIN successful DETAIL: node 2 is now attached to node 3 NOTICE: executing STANDBY FOLLOW on 1 of 1 siblings INFO: STANDBY FOLLOW successfully executed on all reachable sibling nodes NOTICE: switchover was successful DETAIL: node ""node2.raveland.priv"" is now primary and node ""node1.raveland.priv"" is attached as standby NOTICE: STANDBY SWITCHOVER has completed successfully
-
Verify :
$ repmgr -f /etc/repmgr.conf cluster show ID | Name | Role | Status | Upstream | Location | Connection string +++-++++++++++++++++++++++--++++++++++++++++++++--++++++++++++++++++++++--++++++++++-++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 | "node0.raveland.priv" | standby | running | "node2.raveland.priv" | default | host=node0.raveland.priv user=repmgr dbname=repmgr connect_timeout=2 2 | "node1.raveland.priv" | standby | running | "node2.raveland.priv" | default | host=node1.raveland.priv user=repmgr dbname=repmgr connect_timeout=2 3 | "node2.raveland.priv" | primary | * running | | default | host=node2.raveland.priv user=repmgr dbname=repmgr connect_timeout=2
Enjoy !!!
Introducing repmgrd: replication manager daemon
Now that you have your PostgreSQL cluster with repmgr, you are happy. But maybe you would like to have a bit of automation.. That’s where repmgrd comes !
As well said in the documentation: repmgrd is a management and monitoring daemon which runs on each node in a replication cluster.
It can automate actions such as failover and updating standbys to follow the new primary, as well as providing monitoring information about the state of each standby.
### Configure repmgrd In this setup, you will just need to modify the file `/etc/default/repmgrd` to enable repmgrd at boot.-
Edit the file
/etc/default/repmgrd
:REPMGRD_ENABLED=yes REPMGRD_CONF="/etc/repmgr.conf" REPMGRD_OPTS="--daemonize=false" REPMGRD_USER=postgres REPMGRD_BIN=/usr/bin/repmgrd REPMGRD_PIDFILE=/var/run/repmgrd.pid
-
Start the daemon:
systemctl restart repmgrd
(yes restart) -
watch the logs:
$ tail /var/log/postgresql/repmgrd.log [2018-12-21 21:54:02] [NOTICE] starting monitoring of node ""node0.raveland.priv"" (ID: 1) [2018-12-21 21:54:02] [INFO] monitoring connection to upstream node ""node2.raveland.priv"" (node ID: 3)
-
Do this on all the PostgreSQL machines
Let see what happens when the primary fails for an unknown reason (like the example above).
-
Shutdown the primary (simulate failure):
systemctl stop postgresql
-
Watch the logs on the secondaries:
$ tail -f /var/log/postgresql/repmgrd.log [2018-12-21 22:06:04] [WARNING] unable to connect to upstream node ""node2.raveland.priv"" (node ID: 3) [2018-12-21 22:06:04] [INFO] checking state of node 3, 1 of 6 attempts ... [2018-12-21 22:06:54] [NOTICE] this node is the winner, will now promote itself and inform other nodes [2018-12-21 22:06:54] [INFO] promote_command is: "/usr/bin/repmgr standby promote -f /etc/repmgr.conf --log-to-file" [2018-12-21 22:06:54] [NOTICE] redirecting logging output to "/var/log/postgresql/repmgrd.log" [2018-12-21 22:06:54] [NOTICE] promoting standby to primary [2018-12-21 22:06:54] [DETAIL] promoting server ""node0.raveland.priv"" (ID: 1) using "sudo /usr/bin/pg_ctlcluster 10 main promote" [2018-12-21 22:06:54] [DETAIL] waiting up to 15 seconds (parameter "promote_check_timeout") for promotion to complete [2018-12-21 22:06:54] [NOTICE] STANDBY PROMOTE successful [2018-12-21 22:06:54] [DETAIL] server ""node0.raveland.priv"" (ID: 1) was successfully promoted to primary INFO: node 2 received notification to follow node 1 [2018-12-21 22:06:54] [INFO] switching to primary monitoring mode
This secondary server has been chosen as new primary.
[2018-12-21 22:06:54] [WARNING] unable to reconnect to node 3 after 6 attempts [2018-12-21 22:06:54] [INFO] follower node awaiting notification from a candidate node [2018-12-21 22:06:55] [NOTICE] redirecting logging output to "/var/log/postgresql/repmgrd.log" [2018-12-21 22:06:55] [NOTICE] setting node 2s primary to node 1 [2018-12-21 22:06:55] [NOTICE] restarting server using "sudo /usr/bin/pg_ctlcluster 10 main restart" [2018-12-21 22:07:03] [WARNING] unable to connect to old upstream node 3 to remove replication slot [2018-12-21 22:07:03] [HINT] if reusing this node, you should manually remove any inactive replication slots [2018-12-21 22:07:03] [NOTICE] STANDBY FOLLOW successful [2018-12-21 22:07:03] [DETAIL] standby attached to upstream node ""node0.raveland.priv"" (node ID: 1) INFO: set_repmgrd_pid(): provided pidfile is /tmp/repmgrd.pid [2018-12-21 22:07:03] [NOTICE] node 2 now following new upstream node 1 [2018-12-21 22:07:03] [INFO] resuming standby monitoring mode [2018-12-21 22:07:03] [DETAIL] following new primary ""node0.raveland.priv"" (node id: 1)
This one has been updated to follow the new primary.
-
Verify the status
$ repmgr -f /etc/repmgr.conf cluster show ID | Name | Role | Status | Upstream | Location | Connection string +++-++++++++++++++++++++++--++++++++++++++++++++--++++++++++++++++++++++--++++++++++-++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 | "node0.raveland.priv" | primary | * running | | default | host=node0.raveland.priv user=repmgr dbname=repmgr connect_timeout=2 2 | "node1.raveland.priv" | standby | running | "node0.raveland.priv" | default | host=node1.raveland.priv user=repmgr dbname=repmgr connect_timeout=2 3 | "node2.raveland.priv" | primary | - failed | | default | host=node2.raveland.priv user=repmgr dbname=repmgr connect_timeout=2 WARNING: following issues were detected unable to connect to node ""node2.raveland.priv"" (ID: 3)
Everything is fine !
#### The old primary comes back The problem we had with *repmgr* is the same with *repmgrd*. The old primary **will stay** primary !!-
Start the old primary :
systemctl start postgresql
-
Verify :
$ repmgr -f /etc/repmgr.conf cluster show ID | Name | Role | Status | Upstream | Location | Connection string +++-++++++++++++++++++++++--++++++++++++++++++++++++++++++++-++++++++++++++++++++++--++++++++++-++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 | "node0.raveland.priv" | standby | ! running as primary | "node2.raveland.priv" | default | host=node0.raveland.priv user=repmgr dbname=repmgr connect_timeout=2 2 | "node1.raveland.priv" | standby | running | "node2.raveland.priv" | default | host=node1.raveland.priv user=repmgr dbname=repmgr connect_timeout=2 3 | "node2.raveland.priv" | primary | * running | | default | host=node2.raveland.priv user=repmgr dbname=repmgr connect_timeout=2 WARNING: following issues were detected node ""node0.raveland.priv"" (ID: 1) is registered as standby but running as primary
The solution is the same as before:
-
Stop the old primary:
systemctl stop postgresql
-
Stop repmgrd :
systemct stop repmgrd
-
Rejoin:
$ repmgr -f /etc/repmgr.conf node service --action=stop --checkpoint $ repmgr -f /etc/repmgr.conf -d 'host=node0.raveland.priv user=repmgr dbname=repmgr' node rejoin
-
Start repmgrd:
systemctl start repmgrd
Now that we have a PostgreSQL cluster with automation thanks to repmgrd, it could be usefull to add pgbouncer into the architecture.
Like this, we will have only one entrypoint for our applications.
### Basic configuration for pgbouncerAt the beginning of this post, we already installed pgbouncer on the machine. Now we will do a basic configuration.
-
Edit the file
/etc/pgbouncer/pgbouncer.ini
[pgbouncer] logfile = /var/log/postgresql/pgbouncer.log pidfile = /var/run/postgresql/pgbouncer.pid listen_addr = * listen_port = 5432 unix_socket_dir = /var/run/postgresql auth_type = trust auth_file = /etc/pgbouncer/userlist.txt admin_users = postgres stats_users = postgres pool_mode = transaction server_reset_query = DISCARD ALL %include /etc/pgbouncer/pgbouncer.database.ini
-
Create an empty file
/etc/pgbouncer/pgbouncer.database.ini
-
Edit the file
/etc/pgbouncer/userlist.txt
"postgres" "MySuperPasswordAdmin"
-
Set rights:
chown postgres:postgres /etc/pgbouncer /etc/pgbouncer/pgbouncer.database.ini
-
Restart pgbouncer:
systemctl restart pgbouncer
In order to test our setup, we need to create an user and a database in the PostgreSQL cluster (on the primary of course).
$ psql
psql (10.6 (Debian 10.6-1.pgdg90+1))
Type "help" for help.
postgres=# create role pea login ;
CREATE ROLE
postgres=# \password pea
Enter new password:
Enter it again:
postgres=# create database pea owner pea ;
CREATE DATABASE
Now that we have a user, a database and a primary server, we will finish the first configuration for pgbouncer.
-
Edit the file
/etc/pgbouncer/userlist.ini
and setup your username and password"pea" "MySuperPassword"
-
Edit the file
/etc/pgbouncer/pgbouncer.database.ini
[databases] pea= host=node1.raveland.priv
-
Test pgbouncer (directly from the pgbouncer machine)
# psql -U pea psql (10.6 (Debian 10.6-1.pgdg90+1)) Type "help" for help. pea=>
Everything works !
### Integrate pgbouncer with repmgrThe purpose is to have something automatic and each time the cluster’s topology is modified, pgbouncer must be notified.
As you may have understood, when the topology will change, we will generate a new file called pgbouncer.database.ini
and send it to pgbouncer.
For this, we need to adjust the configuration of repmgr.
But let me explain you how we will make this happen:
-
All our PostgreSQL nodes are up and running. Pgbouncer is happy with it’s configuration pointing to the actual primary.
-
The primary fails…
-
repmgrd will choose a new primary
-
The promote command will be executed on the new primary. By default, it’s
/usr/bin/pg_ctlcluster 10 main promote
. So we will change this command and likethis we will generate a new file with the new primary info, do the promote and send the file to pgbouncer.
-
Once the promote finished, the other secondaries will execute a follow
The first thing to do is to write a small script that will :
-
Do the promote
-
Generate the new configuration file for pgbouncer
-
Send it to pgbouncer
-
Create a file
/usr/local/bin/promote.sh
#!/usr/bin/env bash set -e set -u PGBOUNCER_DATABASE_INI_NEW="/tmp/pgbouncer.database.ini" PGBOUNCER_HOSTS="pgbouncer.raveland.priv" DATABASES="pea" # Pause pgbouncer for h in ${PGBOUNCER_HOSTS} do for d in ${DATABASES} do psql -U postgres -h ${h} -p 5432 pgbouncer -tc "pause ${d}" done done # Promote server sudo /usr/bin/pg_ctlcluster 10 main promote # Generate new config file for pgbouncer echo -e "[databases]\n" > ${PGBOUNCER_DATABASE_INI_NEW} for d in ${DATABASES} do echo -e "${d}= host=$(hostname -f)\n" >> ${PGBOUNCER_DATABASE_INI_NEW} done # Copy new config file, reload and resume pgbouncer for h in ${PGBOUNCER_HOSTS} do for d in ${DATABASES} do rsync -a ${PGBOUNCER_DATABASE_INI_NEW} ${h}:/etc/pgbouncer/pgbouncer.database.ini psql -U postgres -h ${h} -p 5432 pgbouncer -tc "reload" psql -U postgres -h ${h} -p 5432 pgbouncer -tc "resume ${d}" done done rm ${PGBOUNCER_DATABASE_INI_NEW}
Please note that you have to fill the PGBOUNCER_HOSTS and DATABASES variables.
-
Be sure that postgres user can execute this file :
chmod 755 /usr/local/bin/promote.sh
-
Update
/etc/repmgr.conf
on all PostgreSQL machines and change this lineservice_promote_command = '/usr/local/bin/promote.sh'
At this point, you should have the script /usr/local/bin/promote.sh
deployed on all your PostgreSQL machines and the file /etc/repmgr.conf
updated with the latest change.
-
Shutdown the primary:
systemctl stop postgresql
-
Wait for the new primary
-
Watch the pgbouncer logs:
$ tail -f /var/log/postgresql/pgbouncer.log 2018-12-21 23:46:51.141 6182 LOG C-0x563b36d38e50: pgbouncer/postgres@192.168.1.51:56442 login attempt: db=pgbouncer user=postgres tls=no 2018-12-21 23:46:51.141 6182 LOG PAUSE 'pea' command issued 2018-12-21 23:46:51.142 6182 LOG C-0x563b36d38e50: pgbouncer/postgres@192.168.1.51:56442 closing because: client close request (age=0) 2018-12-21 23:46:51.566 6182 LOG C-0x563b36d38e50: pgbouncer/postgres@192.168.1.51:56446 login attempt: db=pgbouncer user=postgres tls=no 2018-12-21 23:46:51.566 6182 LOG RELOAD command issued 2018-12-21 23:46:51.566 6182 LOG C-0x563b36d38e50: pgbouncer/postgres@192.168.1.51:56446 closing because: client close request (age=0) 2018-12-21 23:46:51.605 6182 LOG C-0x563b36d38e50: pgbouncer/postgres@192.168.1.51:56448 login attempt: db=pgbouncer user=postgres tls=no 2018-12-21 23:46:51.605 6182 LOG RESUME 'pea' command issued
Victory !!!
## ConclusionThe purpose of this (long) post was to show you how to setup a PostgreSQL cluster with repmgr and pgbouncer.
Of course, I can’t detail all the options of repgmr and you have to do your homeworks :)
But now, at least, you have the basis.
Enjoy 😉