Set Up a PostgreSQL Cluster With Repmgr and Pgbouncer


Recently I had to set up a PostgreSQL cluster, and one of the prerequisites was to use repmgr.

In this post, I will explain the work I did and how to set up this kind of cluster.

For the curious, all the documentation for repmgr is available here.

## The architecture

My setup is:

  • node0 / PostgreSQL (192.168.1.50)
  • node1 / PostgreSQL (192.168.1.51)
  • node2 / PostgreSQL (192.168.1.52)
  • pgbouncer / pgbouncer (192.168.1.53)

node0 to node2 are the PostgreSQL machines with one primary and two secondaries.

The pgbouncer machine will receive all the queries and send them to the actual primary.
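
For example, once everything is in place, applications will point only at the pgbouncer address, and pgbouncer will forward the traffic to whichever node is currently the primary. A small sketch of such a connection (the pea user and database used here are created later in this post):

    $ psql "host=192.168.1.53 port=5432 user=pea dbname=pea"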

## Set up the machines

For this post, all my machines will be on Debian 9 with PostgreSQL 10.

### Install the packages

  • Install the 2ndQuadrant repository (for repmgr and pgbouncer) on all the PostgreSQL machines

    # apt-get install apt-transport-https
    # echo "deb https://dl.2ndquadrant.com/default/release/apt stretch-2ndquadrant main" > /etc/apt/sources.list.d/2ndquadrant.list
    # wget --quiet -O - https://dl.2ndquadrant.com/gpg-key.asc | apt-key add -
    
  • Install PostgreSQL and repmgr on all the PostgreSQL machines

    # apt-get install postgresql-10 postgresql-10-repmgr
    # systemctl stop postgresql
    

    We need to stop PostgreSQL because we have to do some configuration.

  • Install the PostgreSQL repository (for PostgreSQL 10) on all the machines (PostgreSQL and pgbouncer)

    # echo "deb http://apt.postgresql.org/pub/repos/apt/ stretch-pgdg main" > /etc/apt/sources.list.d/pgdg.list
    # wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add -
    # apt-get update
    
  • Install pgbouncer and the pgsql client on the pgbouncer machine

    # apt-get install pgbouncer postgresql-client-10
    

### Set up passwordless SSH

Passwordless SSH connectivity between all machines is not strictly mandatory, but repmgr requires it for operations such as repmgr standby switchover, and we will use it for pgbouncer too (as you will see later, our promotion script copies a configuration file to the pgbouncer machine over SSH).

Follow these steps:

  • Create a new pair of SSH keys and save them in a temporary directory

    $ ssh-keygen
    Generating public/private rsa key pair.
    Enter file in which to save the key (/home/pea/.ssh/id_rsa): /home/pea/tmp/repmgr/id_rsa
    Enter passphrase (empty for no passphrase):
    Enter same passphrase again:
    Your identification has been saved in /home/pea/tmp/repmgr/id_rsa.
    Your public key has been saved in /home/pea/tmp/repmgr/id_rsa.pub.
    
  • Create an .ssh directory on all machines (PostgreSQL and pgbouncer) for the postgres user:

    # for i in node0 node1 node2 pgbouncer ; do echo ${i} ; ssh root@${i} "mkdir /var/lib/postgresql/.ssh && chown postgres:postgres /var/lib/postgresql/.ssh" ; done
    
  • Copy the ssh keys on all the machines (PostgreSQL and pgbouncer)

    # for i in node0 node1 node2 pgbouncer ; do echo ${i} ; scp id_rsa* root@${i}:/var/lib/postgresql/.ssh/ ; done
    # for i in node0 node1 node2 pgbouncer ; do echo ${i} ; scp id_rsa.pub root@${i}:/var/lib/postgresql/.ssh/authorized_keys ; done
    
  • Generate the file known_hosts and copy it to all machines

    # ssh-keyscan node0.raveland.priv node1.raveland.priv node2.raveland.priv pgbouncer.raveland.priv > known_hosts
    # for i in node0 node1 node2 pgbouncer ; do echo ${i} ; scp known_hosts root@${i}:/var/lib/postgresql/.ssh/ ; done
    # for i in node0 node1 node2 pgbouncer ; do echo ${i} ; ssh root@${i} "chown postgres:postgres /var/lib/postgresql/.ssh/*" ; done
    

    WARNING: for the last step, I used the FQDN of my machines. It's better to do it this way to avoid host-key problems later.

Now we have passwordless SSH connectivity between all machines for the postgres user.
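
If you want to verify it, this quick check (a sketch, run as root from any machine, using the FQDNs as recommended above) should print each remote hostname without asking for a password:

    # for i in node0.raveland.priv node1.raveland.priv node2.raveland.priv pgbouncer.raveland.priv ; do su - postgres -c "ssh -o BatchMode=yes ${i} hostname" ; done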

## Configuring repmgr

In order to configure repmgr, you will need to:

  • know the network subnet where repmgr will be deployed
  • update the PostgreSQL configuration (file postgresql.conf)
  • update the PostgreSQL client authentication configuration file (file pg_hba.conf)

In order to bootstrap your cluster, you will have to choose a server that will be the primary.

In my case, the network subnet is 192.168.1.0/24 and the primary will be node0.

HINT: all the repmgr commands must be run as user postgres.

### Changes to apply to all PostgreSQL machines

These changes must be applied on all the PostgreSQL machines (both primary and secondaries).

First, we need to configure sudo.

  • Create the file /etc/sudoers.d/postgres

    # echo "postgres ALL = NOPASSWD: /usr/bin/pg_ctlcluster" > /etc/sudoers.d/postgres
    

Then we will modify the /etc/postgresql/10/main/postgresql.conf file.

  • Change listen_addresses to listen_addresses = '*'

  • Change shared_preload_libraries to shared_preload_libraries = 'repmgr'

  • Add at the end of the file, this line: include 'postgresql.replication.conf'

  • Create the file /etc/postgresql/10/main/postgresql.replication.conf with this:

    max_wal_senders = 15
    max_replication_slots = 15
    wal_level = 'replica'
    hot_standby = on
    archive_mode = on
    archive_command = '/bin/true'
    wal_keep_segments = 500
    

    Of course, you can adjust these values to your setup.

  • Now we need to adapt the /etc/postgresql/10/main/pg_hba.conf file. Add the following lines

    local   replication   repmgr                              trust
    host    replication   repmgr      127.0.0.1/32            trust
    host    replication   repmgr      192.168.1.0/24          trust
    
    local   repmgr        repmgr                              trust
    host    repmgr        repmgr      127.0.0.1/32            trust
    host    repmgr        repmgr      192.168.1.0/24          trust
    

    On Debian, add these lines just after this one: local all postgres peer

    Of course, replace the subnet 192.168.1.0/24 with yours.

    WARNING: you need to adjust pg_hba.conf to allow your applications to connect to the PostgreSQL server.

    For this setup, I added at the end: host all all 192.168.1.0/24 md5

### Configuring the primary PostgreSQL server

Now that you have applied the changes above, follow these steps:

  • Start PostgreSQL: # systemctl start postgresql

  • Create the repmgr database and user:

    # su - postgres
    $ createuser -s repmgr
    $ createdb repmgr -O repmgr
    
  • Create the file /etc/repmgr.conf (the postgres user must be able to read it)

    node_id=1
    node_name="node0.raveland.priv"
    conninfo='host=node0.raveland.priv user=repmgr dbname=repmgr connect_timeout=2'
    data_directory='/var/lib/postgresql/10/main'
    
    use_replication_slots=yes
    monitoring_history=yes
    
    service_start_command   = 'sudo /usr/bin/pg_ctlcluster 10 main start'
    service_stop_command    = 'sudo /usr/bin/pg_ctlcluster 10 main stop'
    service_restart_command = 'sudo /usr/bin/pg_ctlcluster 10 main restart'
    service_reload_command  = 'sudo /usr/bin/pg_ctlcluster 10 main reload'
    service_promote_command = 'sudo /usr/bin/pg_ctlcluster 10 main promote'
    
    promote_check_timeout = 15
    
    failover=automatic
    promote_command='/usr/bin/repmgr standby promote -f /etc/repmgr.conf --log-to-file'
    follow_command='/usr/bin/repmgr standby follow -f /etc/repmgr.conf --log-to-file --upstream-node-id=%n'
    
    log_file='/var/log/postgresql/repmgrd.log'
    

    In this file, only 4 parameters are mandatory:

    • node_id
    • node_name
    • conninfo
    • data_directory

    You can find a documented repmgr.conf here.

    WARNING 1: for Debian users, it's recommended to use the pg_ctlcluster wrapper to control PostgreSQL.

    WARNING 2: in this configuration, I have already set up some parameters for repmgrd (they will be used later in this post).

  • Register the primary server:

    # su - postgres
    $ repmgr -f /etc/repmgr.conf primary register
    INFO: connecting to primary database...
    NOTICE: attempting to install extension "repmgr"
    NOTICE: "repmgr" extension successfully installed
    NOTICE: primary node record (id: 1) registered
    

Now our primary server is up and running.
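
At this stage you can already run a first check: the cluster show command (covered in more detail later in this post) should list a single node, node0, as primary with status "* running".

    $ repmgr -f /etc/repmgr.conf cluster show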


### Configuring the secondary PostgreSQL servers

Follow these steps to configure each of your secondary servers:

  • Ensure that PostgreSQL is shut down: systemctl stop postgresql

  • Trash the data directory of PostgreSQL: rm -fR /var/lib/postgresql/10/main/*

  • Ensure you can connect to the primary node: psql -h node0.raveland.priv -U repmgr

  • Create the /etc/repmgr.conf file as on the primary server, but adapt the values node_id, node_name and conninfo

    node_id=2
    node_name="node1.raveland.priv"
    conninfo='host=node1.raveland.priv user=repmgr dbname=repmgr connect_timeout=2'
    data_directory='/var/lib/postgresql/10/main'
    
    use_replication_slots=yes
    monitoring_history=yes
    
    service_start_command   = 'sudo /usr/bin/pg_ctlcluster 10 main start'
    service_stop_command    = 'sudo /usr/bin/pg_ctlcluster 10 main stop'
    service_restart_command = 'sudo /usr/bin/pg_ctlcluster 10 main restart'
    service_reload_command  = 'sudo /usr/bin/pg_ctlcluster 10 main reload'
    service_promote_command = 'sudo /usr/bin/pg_ctlcluster 10 main promote'
    
    promote_check_timeout = 15
    
    failover=automatic
    promote_command='/usr/bin/repmgr standby promote -f /etc/repmgr.conf --log-to-file'
    follow_command='/usr/bin/repmgr standby follow -f /etc/repmgr.conf --log-to-file --upstream-node-id=%n'
    
    log_file='/var/log/postgresql/repmgrd.log'
    
  • Clone the primary server (first dry-run)

    # su - postgres
    $ repmgr -h node0.raveland.priv -U repmgr -d repmgr -f /etc/repmgr.conf standby clone --dry-run
    NOTICE: destination directory "/var/lib/postgresql/10/main" provided
    INFO: connecting to source node
    DETAIL: connection string is: host=node0.raveland.priv user=repmgr dbname=repmgr
    DETAIL: current installation size is 30 MB
    NOTICE: standby will attach to upstream node 1
    HINT: consider using the -c/--fast-checkpoint option
    INFO: all prerequisites for "standby clone" are met
    
  • Clone the primary server (for real this time)

    $ repmgr -h node0.raveland.priv -U repmgr -d repmgr -f /etc/repmgr.conf standby clone
    NOTICE: destination directory "/var/lib/postgresql/10/main" provided
    INFO: connecting to source node
    DETAIL: connection string is: host=node0.raveland.priv user=repmgr dbname=repmgr
    DETAIL: current installation size is 30 MB
    INFO: checking and correcting permissions on existing directory "/var/lib/postgresql/10/main"
    NOTICE: starting backup (using pg_basebackup)...
    HINT: this may take some time; consider using the -c/--fast-checkpoint option
    INFO: executing:
      pg_basebackup -l "repmgr base backup"  -D /var/lib/postgresql/10/main -h node0.raveland.priv -p 5432 -U repmgr -X stream -S repmgr_slot_2
    NOTICE: standby clone (using pg_basebackup) complete
    NOTICE: you can now start your PostgreSQL server
    HINT: for example: sudo /usr/bin/pg_ctlcluster 10 main start
    HINT: after starting the server, you need to register this standby with "repmgr standby register"
    
  • Start the PostgreSQL server

    $ sudo /usr/bin/pg_ctlcluster 10 main start
    
  • Register the secondary

    $ repmgr -f /etc/repmgr.conf standby register
    INFO: connecting to local node ""node1.raveland.priv"" (ID: 2)
    INFO: connecting to primary database
    WARNING: --upstream-node-id not supplied, assuming upstream node is primary (node ID 1)
    INFO: standby registration complete
    NOTICE: standby node ""node1.raveland.priv"" (id: 2) successfully registered
    
  • Do the same for all your secondaries

  • As an example, here are the steps for my node2:

    • # systemctl stop postgresql

    • # rm -fr /var/lib/postgresql/10/main/*

    • Create /etc/repmgr.conf

      ```
      node_id=3
      node_name="node2.raveland.priv"
      conninfo='host=node2.raveland.priv user=repmgr dbname=repmgr connect_timeout=2'
      data_directory='/var/lib/postgresql/10/main'
      
      use_replication_slots=yes
      monitoring_history=yes
      
      service_start_command   = 'sudo /usr/bin/pg_ctlcluster 10 main start'
      service_stop_command    = 'sudo /usr/bin/pg_ctlcluster 10 main stop'
      service_restart_command = 'sudo /usr/bin/pg_ctlcluster 10 main restart'
      service_reload_command  = 'sudo /usr/bin/pg_ctlcluster 10 main reload'
      service_promote_command = 'sudo /usr/bin/pg_ctlcluster 10 main promote'
      
      promote_check_timeout = 15
      
      failover=automatic
      promote_command='/usr/bin/repmgr standby promote -f /etc/repmgr.conf --log-to-file'
      follow_command='/usr/bin/repmgr standby follow -f /etc/repmgr.conf --log-to-file --upstream-node-id=%n'
      
      log_file='/var/log/postgresql/repmgrd.log'
      ```
      
      • Clone the primary server

        $ repmgr -h node0.raveland.priv -U repmgr -d repmgr -f /etc/repmgr.conf standby clone
        NOTICE: destination directory "/var/lib/postgresql/10/main" provided
        INFO: connecting to source node
        DETAIL: connection string is: host=node0.raveland.priv user=repmgr dbname=repmgr
        DETAIL: current installation size is 30 MB
        INFO: checking and correcting permissions on existing directory "/var/lib/postgresql/10/main"
        NOTICE: starting backup (using pg_basebackup)...
        HINT: this may take some time; consider using the -c/--fast-checkpoint option
        INFO: executing:
          pg_basebackup -l "repmgr base backup"  -D /var/lib/postgresql/10/main -h node0.raveland.priv -p 5432 -U repmgr -X stream -S repmgr_slot_3
        NOTICE: standby clone (using pg_basebackup) complete
        NOTICE: you can now start your PostgreSQL server
        HINT: for example: sudo /usr/bin/pg_ctlcluster 10 main start
        HINT: after starting the server, you need to register this standby with "repmgr standby register"
        
      • Start PostgreSQL: /usr/bin/pg_ctlcluster 10 main start

      • Register the secondary

        $ repmgr -f /etc/repmgr.conf standby register
        INFO: connecting to local node ""node2.raveland.priv"" (ID: 3)
        INFO: connecting to primary database
        WARNING: --upstream-node-id not supplied, assuming upstream node is primary (node ID 1)
        INFO: standby registration complete
        NOTICE: standby node ""node2.raveland.priv"" (id: 3) successfully registered
        

## Playing with repmgr

Now all our servers are up and running and we have a running PostgreSQL cluster.

We will see how to do basic operations with repmgr.

REMINDER: all the repmgr commands must be run as postgres user.


### Cluster status

$ repmgr -f /etc/repmgr.conf cluster show
ID | Name                  | Role    | Status    | Upstream              | Location | Connection string                                                   
---+-----------------------+---------+-----------+-----------------------+----------+----------------------------------------------------------------------
1  | "node0.raveland.priv" | primary | * running |                       | default  | host=node0.raveland.priv user=repmgr dbname=repmgr connect_timeout=2
2  | "node1.raveland.priv" | standby |   running | "node0.raveland.priv" | default  | host=node1.raveland.priv user=repmgr dbname=repmgr connect_timeout=2
3  | "node2.raveland.priv" | standby |   running | "node0.raveland.priv" | default  | host=node2.raveland.priv user=repmgr dbname=repmgr connect_timeout=2
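
Another handy command is cluster event, which lists the most recent cluster events (node registrations, promotions, follows, and so on):

    $ repmgr -f /etc/repmgr.conf cluster event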

### My primary server has failed, what happens?

Now we will see what to do when your primary stops for an unknown reason.

  • Shutdown primary (simulate failure)

    ```bash
    # systemctl stop postgresql
    ```
    
  • Verify on secondary

    ```bash
    $ repmgr -f /etc/repmgr.conf cluster show
    ID | Name                  | Role    | Status        | Upstream              | Location | Connection string                                                   
    ---+-----------------------+---------+---------------+-----------------------+----------+----------------------------------------------------------------------
    1  | "node0.raveland.priv" | primary | ? unreachable |                       | default  | host=node0.raveland.priv user=repmgr dbname=repmgr connect_timeout=2
    2  | "node1.raveland.priv" | standby |   running     | "node0.raveland.priv" | default  | host=node1.raveland.priv user=repmgr dbname=repmgr connect_timeout=2
    3  | "node2.raveland.priv" | standby |   running     | "node0.raveland.priv" | default  | host=node2.raveland.priv user=repmgr dbname=repmgr connect_timeout=2
    
    WARNING: following issues were detected
     unable to connect to node ""node0.raveland.priv"" (ID: 1)
     node ""node0.raveland.priv"" (ID: 1) is registered as an active primary but is unreachable
    ```
    
  • Promote node1 as primary

    ```bash
    $ repmgr -f /etc/repmgr.conf standby promote
    NOTICE: promoting standby to primary
    DETAIL: promoting server ""node1.raveland.priv"" (ID: 2) using "sudo /usr/bin/pg_ctlcluster 10 main promote"
    DETAIL: waiting up to 15 seconds (parameter "promote_check_timeout") for promotion to complete
    NOTICE: STANDBY PROMOTE successful
    DETAIL: server ""node1.raveland.priv"" (ID: 2) was successfully promoted to primary
    ```
    
  • Verify on node2

    ```bash
    $ repmgr -f /etc/repmgr.conf cluster show
    ID | Name                  | Role    | Status               | Upstream              | Location | Connection string                                                   
    ---+-----------------------+---------+----------------------+-----------------------+----------+----------------------------------------------------------------------
    1  | "node0.raveland.priv" | primary | ? unreachable        |                       | default  | host=node0.raveland.priv user=repmgr dbname=repmgr connect_timeout=2
    2  | "node1.raveland.priv" | standby | ! running as primary | "node0.raveland.priv" | default  | host=node1.raveland.priv user=repmgr dbname=repmgr connect_timeout=2
    3  | "node2.raveland.priv" | standby |   running            | "node0.raveland.priv" | default  | host=node2.raveland.priv user=repmgr dbname=repmgr connect_timeout=2
    
    WARNING: following issues were detected
      unable to connect to node ""node0.raveland.priv"" (ID: 1)
      node ""node0.raveland.priv"" (ID: 1) is registered as an active primary but is unreachable
      node ""node1.raveland.priv"" (ID: 2) is registered as standby but running as primary
    ```
    
    That's **not good** because *node2* is still following *node0*.
    
  • Update node2

    ```bash
    $ repmgr -f /etc/repmgr.conf standby follow
    NOTICE: setting node 3's primary to node 2
    NOTICE: restarting server using "sudo /usr/bin/pg_ctlcluster 10 main restart"
    WARNING: unable to connect to old upstream node 1 to remove replication slot
    HINT: if reusing this node, you should manually remove any inactive replication slots
    NOTICE: STANDBY FOLLOW successful
    DETAIL: standby attached to upstream node ""node1.raveland.priv"" (node ID: 2)
    ```
    
  • Verify

    ```bash
    $ repmgr -f /etc/repmgr.conf cluster show
    ID | Name                  | Role    | Status    | Upstream              | Location | Connection string                                                   
    ---+-----------------------+---------+-----------+-----------------------+----------+----------------------------------------------------------------------
    1  | "node0.raveland.priv" | primary | - failed  |                       | default  | host=node0.raveland.priv user=repmgr dbname=repmgr connect_timeout=2
    2  | "node1.raveland.priv" | primary | * running |                       | default  | host=node1.raveland.priv user=repmgr dbname=repmgr connect_timeout=2
    3  | "node2.raveland.priv" | standby |   running | "node1.raveland.priv" | default  | host=node2.raveland.priv user=repmgr dbname=repmgr connect_timeout=2
    
    WARNING: following issues were detected
      unable to connect to node ""node0.raveland.priv"" (ID: 1)
    ```
    

Now we are fine (the old primary is still stopped). But as you can see, it's a bit complicated...


### My old primary is back... what will happen?

One thing to know about repmgr is that when an old primary comes back, it will stay primary!

  • Restart the old primary (on node0): systemctl start postgresql

  • Verify the status on the old primary (node0)

    $ repmgr -f /etc/repmgr.conf cluster show
    ID | Name                  | Role    | Status               | Upstream              | Location | Connection string                                                   
    ---+-----------------------+---------+----------------------+-----------------------+----------+----------------------------------------------------------------------
    1  | "node0.raveland.priv" | primary | * running            |                       | default  | host=node0.raveland.priv user=repmgr dbname=repmgr connect_timeout=2
    2  | "node1.raveland.priv" | standby | ! running as primary | "node0.raveland.priv" | default  | host=node1.raveland.priv user=repmgr dbname=repmgr connect_timeout=2
    3  | "node2.raveland.priv" | standby |   running            | "node0.raveland.priv" | default  | host=node2.raveland.priv user=repmgr dbname=repmgr connect_timeout=2
    
    WARNING: following issues were detected
       node ""node1.raveland.priv"" (ID: 2) is registered as standby but running as primary
    

    It still thinks it's a primary server, but it detects that something is wrong.

  • Verify the status on secondaries

    $ repmgr -f /etc/repmgr.conf cluster show
    ID | Name                  | Role    | Status    | Upstream              | Location | Connection string                                                   
    ---+-----------------------+---------+-----------+-----------------------+----------+----------------------------------------------------------------------
    1  | "node0.raveland.priv" | primary | ! running |                       | default  | host=node0.raveland.priv user=repmgr dbname=repmgr connect_timeout=2
    2  | "node1.raveland.priv" | primary | * running |                       | default  | host=node1.raveland.priv user=repmgr dbname=repmgr connect_timeout=2
    3  | "node2.raveland.priv" | standby |   running | "node1.raveland.priv" | default  | host=node2.raveland.priv user=repmgr dbname=repmgr connect_timeout=2
    
    WARNING: following issues were detected
       node ""node0.raveland.priv"" (ID: 1) is running but the repmgr node record is inactive
    

    The secondaries can see the old primary, but its node record is marked inactive.

Here are the steps to follow to turn your old primary into a new secondary.

  • Shut down PostgreSQL on the old primary: systemctl stop postgresql

  • Rejoin the node:

    $ repmgr -f /etc/repmgr.conf node service --action=stop --checkpoint
    $ repmgr -f /etc/repmgr.conf -d 'host=node1.raveland.priv user=repmgr dbname=repmgr' node rejoin
    NOTICE: setting node 1's primary to node 2
    NOTICE: starting server using "sudo /usr/bin/pg_ctlcluster 10 main start"
    NOTICE: replication slot "repmgr_slot_2" deleted on node 1
    WARNING: 1 inactive replication slots detected
    DETAIL: inactive replication slots:
       repmgr_slot_3 (physical)
    HINT: these replication slots may need to be removed manually
    NOTICE: NODE REJOIN successful
    DETAIL: node 1 is now attached to node 2
    

    Of course, in the host= parameter, you must specify the address of the new primary.


### I want to do maintenance on my primary server. How can I do it?

As you can see, when you lose a primary, you have some manual operations to do.

There is an easier way to switch the primary server: switchover.

Remember, the primary is now node1 in our cluster. But we need to shut it down for maintenance.

  • We want to promote node2 as the new primary. First, try the dry run:

    $ repmgr standby switchover -f /etc/repmgr.conf --siblings-follow --dry-run
    NOTICE: checking switchover on node ""node2.raveland.priv"" (ID: 3) in --dry-run mode
    INFO: SSH connection to host "node1.raveland.priv" succeeded
    INFO: able to execute "repmgr" on remote host "localhost"
    INFO: all sibling nodes are reachable via SSH
    INFO: 2 walsenders required, 15 available
    INFO: demotion candidate is able to make replication connection to promotion candidate
    INFO: 0 pending archive files
    INFO: replication lag on this standby is 0 seconds
    INFO: 2 replication slots required, 15 available
    NOTICE: local node ""node2.raveland.priv"" (ID: 3) would be promoted to primary; current primary ""node1.raveland.priv"" (ID: 2) would be demoted to standby
    INFO: following shutdown command would be run on node ""node1.raveland.priv"":
      "sudo /usr/bin/pg_ctlcluster 10 main stop"
    

    Everything looks good!

  • Do the switchover:

    $ repmgr standby switchover -f /etc/repmgr.conf --siblings-follow
    NOTICE: executing switchover on node ""node2.raveland.priv"" (ID: 3)
    NOTICE: local node ""node2.raveland.priv"" (ID: 3) will be promoted to primary; current primary ""node1.raveland.priv"" (ID: 2) will be demoted to standby
    NOTICE: stopping current primary node ""node1.raveland.priv"" (ID: 2)
    NOTICE: issuing CHECKPOINT
    DETAIL: executing server command "sudo /usr/bin/pg_ctlcluster 10 main stop"
    INFO: checking for primary shutdown; 1 of 60 attempts ("shutdown_check_timeout")
    NOTICE: current primary has been cleanly shut down at location 0/7000028
    NOTICE: promoting standby to primary
    DETAIL: promoting server ""node2.raveland.priv"" (ID: 3) using "sudo /usr/bin/pg_ctlcluster 10 main promote"
    DETAIL: waiting up to 15 seconds (parameter "promote_check_timeout") for promotion to complete
    NOTICE: STANDBY PROMOTE successful
    DETAIL: server ""node2.raveland.priv"" (ID: 3) was successfully promoted to primary
    NOTICE: setting node 2's primary to node 3
    NOTICE: starting server using "sudo /usr/bin/pg_ctlcluster 10 main start"
    NOTICE: replication slot "repmgr_slot_3" deleted on node 2
    WARNING: 1 inactive replication slots detected
    DETAIL: inactive replication slots:
        repmgr_slot_1 (physical)
    HINT: these replication slots may need to be removed manually
    NOTICE: NODE REJOIN successful
    DETAIL: node 2 is now attached to node 3
    NOTICE: executing STANDBY FOLLOW on 1 of 1 siblings
    INFO: STANDBY FOLLOW successfully executed on all reachable sibling nodes
    NOTICE: switchover was successful
    DETAIL: node ""node2.raveland.priv"" is now primary and node ""node1.raveland.priv"" is attached as standby
    NOTICE: STANDBY SWITCHOVER has completed successfully
    
  • Verify:

    $ repmgr -f /etc/repmgr.conf cluster show
    ID | Name                  | Role    | Status    | Upstream              | Location | Connection string                                                   
    ---+-----------------------+---------+-----------+-----------------------+----------+----------------------------------------------------------------------
    1  | "node0.raveland.priv" | standby |   running | "node2.raveland.priv" | default  | host=node0.raveland.priv user=repmgr dbname=repmgr connect_timeout=2
    2  | "node1.raveland.priv" | standby |   running | "node2.raveland.priv" | default  | host=node1.raveland.priv user=repmgr dbname=repmgr connect_timeout=2
    3  | "node2.raveland.priv" | primary | * running |                       | default  | host=node2.raveland.priv user=repmgr dbname=repmgr connect_timeout=2
    

Enjoy!!!

## Introducing repmgrd: replication manager daemon

Now that you have your PostgreSQL cluster with repmgr, you are happy. But maybe you would like a bit of automation... That's where repmgrd comes in!

As well said in the documentation: repmgrd is a management and monitoring daemon which runs on each node in a replication cluster.

It can automate actions such as failover and updating standbys to follow the new primary, as well as providing monitoring information about the state of each standby.


### Configure repmgrd

In this setup, you will just need to modify the file `/etc/default/repmgrd` to enable repmgrd at boot.

  • Edit the file /etc/default/repmgrd:

    REPMGRD_ENABLED=yes
    REPMGRD_CONF="/etc/repmgr.conf"
    REPMGRD_OPTS="--daemonize=false"
    REPMGRD_USER=postgres
    REPMGRD_BIN=/usr/bin/repmgrd
    REPMGRD_PIDFILE=/var/run/repmgrd.pid
    
  • Start the daemon: systemctl restart repmgrd (yes restart)

  • Watch the logs:

    $ tail /var/log/postgresql/repmgrd.log
    [2018-12-21 21:54:02] [NOTICE] starting monitoring of node ""node0.raveland.priv"" (ID: 1)
    [2018-12-21 21:54:02] [INFO] monitoring connection to upstream node ""node2.raveland.priv"" (node ID: 3)
    
  • Do this on all the PostgreSQL machines
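
As a side note, since monitoring_history=yes is set in /etc/repmgr.conf, once repmgrd runs on every node you can also inspect the monitoring data it stores on the primary. A minimal sketch (run as postgres on the primary; the exact columns depend on your repmgr version):

    $ psql repmgr -c 'SELECT * FROM repmgr.monitoring_history;'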


### Playing with repmgrd
#### The primary fails

Let's see what happens when the primary fails for an unknown reason (as in the example above).

  • Shutdown the primary (simulate failure): systemctl stop postgresql

  • Watch the logs on the secondaries:

    $ tail -f /var/log/postgresql/repmgrd.log
    [2018-12-21 22:06:04] [WARNING] unable to connect to upstream node ""node2.raveland.priv"" (node ID: 3)
    [2018-12-21 22:06:04] [INFO] checking state of node 3, 1 of 6 attempts
    ...
    [2018-12-21 22:06:54] [NOTICE] this node is the winner, will now promote itself and inform other nodes
    [2018-12-21 22:06:54] [INFO] promote_command is:
      "/usr/bin/repmgr standby promote -f /etc/repmgr.conf --log-to-file"
    [2018-12-21 22:06:54] [NOTICE] redirecting logging output to "/var/log/postgresql/repmgrd.log"
    [2018-12-21 22:06:54] [NOTICE] promoting standby to primary
    [2018-12-21 22:06:54] [DETAIL] promoting server ""node0.raveland.priv"" (ID: 1) using "sudo /usr/bin/pg_ctlcluster 10 main promote"
    [2018-12-21 22:06:54] [DETAIL] waiting up to 15 seconds (parameter "promote_check_timeout") for promotion to complete
    [2018-12-21 22:06:54] [NOTICE] STANDBY PROMOTE successful
    [2018-12-21 22:06:54] [DETAIL] server ""node0.raveland.priv"" (ID: 1) was successfully promoted to primary
    INFO:  node 2 received notification to follow node 1
    [2018-12-21 22:06:54] [INFO] switching to primary monitoring mode
    

    This secondary server has been chosen as the new primary.

    [2018-12-21 22:06:54] [WARNING] unable to reconnect to node 3 after 6 attempts
    [2018-12-21 22:06:54] [INFO] follower node awaiting notification from a candidate node
    [2018-12-21 22:06:55] [NOTICE] redirecting logging output to "/var/log/postgresql/repmgrd.log"
    [2018-12-21 22:06:55] [NOTICE] setting node 2's primary to node 1
    [2018-12-21 22:06:55] [NOTICE] restarting server using "sudo /usr/bin/pg_ctlcluster 10 main restart"
    [2018-12-21 22:07:03] [WARNING] unable to connect to old upstream node 3 to remove replication slot
    [2018-12-21 22:07:03] [HINT] if reusing this node, you should manually remove any inactive replication slots
    [2018-12-21 22:07:03] [NOTICE] STANDBY FOLLOW successful
    [2018-12-21 22:07:03] [DETAIL] standby attached to upstream node ""node0.raveland.priv"" (node ID: 1)
      INFO:  set_repmgrd_pid(): provided pidfile is /tmp/repmgrd.pid
    [2018-12-21 22:07:03] [NOTICE] node 2 now following new upstream node 1
    [2018-12-21 22:07:03] [INFO] resuming standby monitoring mode
    [2018-12-21 22:07:03] [DETAIL] following new primary ""node0.raveland.priv"" (node id: 1)
    

    This one has been updated to follow the new primary.

  • Verify the status

    $ repmgr -f /etc/repmgr.conf cluster show
    ID | Name                  | Role    | Status    | Upstream              | Location | Connection string                                                   
    ---+-----------------------+---------+-----------+-----------------------+----------+----------------------------------------------------------------------
    1  | "node0.raveland.priv" | primary | * running |                       | default  | host=node0.raveland.priv user=repmgr dbname=repmgr connect_timeout=2
    2  | "node1.raveland.priv" | standby |   running | "node0.raveland.priv" | default  | host=node1.raveland.priv user=repmgr dbname=repmgr connect_timeout=2
    3  | "node2.raveland.priv" | primary | - failed  |                       | default  | host=node2.raveland.priv user=repmgr dbname=repmgr connect_timeout=2
    
    WARNING: following issues were detected
       unable to connect to node ""node2.raveland.priv"" (ID: 3)
    

Everything is fine!


#### The old primary comes back

The problem we had with *repmgr* is the same with *repmgrd*: the old primary **will stay** primary!

  • Start the old primary: systemctl start postgresql

  • Verify:

    $ repmgr -f /etc/repmgr.conf cluster show
    ID | Name                  | Role    | Status               | Upstream              | Location | Connection string                                                   
    ---+-----------------------+---------+----------------------+-----------------------+----------+----------------------------------------------------------------------
    1  | "node0.raveland.priv" | standby | ! running as primary | "node2.raveland.priv" | default  | host=node0.raveland.priv user=repmgr dbname=repmgr connect_timeout=2
    2  | "node1.raveland.priv" | standby |   running            | "node2.raveland.priv" | default  | host=node1.raveland.priv user=repmgr dbname=repmgr connect_timeout=2
    3  | "node2.raveland.priv" | primary | * running            |                       | default  | host=node2.raveland.priv user=repmgr dbname=repmgr connect_timeout=2
    
    WARNING: following issues were detected
       node ""node0.raveland.priv"" (ID: 1) is registered as standby but running as primary
    

The solution is the same as before:

  • Stop the old primary: systemctl stop postgresql

  • Stop repmgrd: systemctl stop repmgrd

  • Rejoin:

    $ repmgr -f /etc/repmgr.conf node service --action=stop --checkpoint
    $ repmgr -f /etc/repmgr.conf -d 'host=node0.raveland.priv user=repmgr dbname=repmgr' node rejoin
    
  • Start repmgrd: systemctl start repmgrd


## Pgbouncer joins the dance

Now that we have a PostgreSQL cluster with automation thanks to repmgrd, it could be useful to add pgbouncer to the architecture.

This way, we will have only one entry point for our applications.


### Basic configuration for pgbouncer

At the beginning of this post, we already installed pgbouncer on the machine. Now we will do a basic configuration.

  • Edit the file /etc/pgbouncer/pgbouncer.ini

    [pgbouncer]
    logfile = /var/log/postgresql/pgbouncer.log
    pidfile = /var/run/postgresql/pgbouncer.pid
    
    listen_addr = *
    listen_port = 5432
    
    unix_socket_dir = /var/run/postgresql
    
    auth_type = trust
    auth_file = /etc/pgbouncer/userlist.txt
    
    admin_users = postgres
    stats_users = postgres
    
    pool_mode = transaction
    
    server_reset_query = DISCARD ALL
    
    %include /etc/pgbouncer/pgbouncer.database.ini
    
  • Create an empty file /etc/pgbouncer/pgbouncer.database.ini

  • Edit the file /etc/pgbouncer/userlist.txt

    "postgres" "MySuperPasswordAdmin"
    
  • Set rights: chown postgres:postgres /etc/pgbouncer /etc/pgbouncer/pgbouncer.database.ini

  • Restart pgbouncer: systemctl restart pgbouncer
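
  • Optionally, check that the pgbouncer admin console answers (a quick sketch using the admin user defined above):

    # psql -U postgres -h 127.0.0.1 -p 5432 pgbouncer -c "SHOW DATABASES;"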


### Create a database and a user in PostgreSQL

In order to test our setup, we need to create a user and a database in the PostgreSQL cluster (on the primary, of course).

$ psql
psql (10.6 (Debian 10.6-1.pgdg90+1))
Type "help" for help.
postgres=# create role pea login ;
CREATE ROLE
postgres=# \password pea
Enter new password:
Enter it again:
postgres=# create database pea owner pea ;
CREATE DATABASE

### Adapt first configuration for pgbouncer

Now that we have a user, a database and a primary server, we will finish the first configuration for pgbouncer.

  • Edit the file /etc/pgbouncer/userlist.txt and add your username and password

    "pea" "MySuperPassword"
    
  • Edit the file /etc/pgbouncer/pgbouncer.database.ini

    [databases]
    pea= host=node1.raveland.priv
    
  • Test pgbouncer (directly from the pgbouncer machine)

    # psql -U pea
    psql (10.6 (Debian 10.6-1.pgdg90+1))
    Type "help" for help.
    
    pea=>
    

Everything works!
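
If you want to double check where the queries actually land, you can ask PostgreSQL for the address it serves the connection from; through pgbouncer this should return the address of the node configured in pgbouncer.database.ini (node1, 192.168.1.51, with the configuration above), not the pgbouncer machine itself (a small sketch):

    # psql -U pea -c "SELECT inet_server_addr();"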


### Integrate pgbouncer with repmgr

The purpose is to make this automatic: each time the cluster's topology is modified, pgbouncer must be notified.

As you may have understood, when the topology changes, we will generate a new file called pgbouncer.database.ini and send it to pgbouncer.

For this, we need to adjust the configuration of repmgr.

But let me explain how we will make this happen:

  • All our PostgreSQL nodes are up and running. Pgbouncer is happy with its configuration pointing to the current primary.

  • The primary fails…

  • repmgrd will choose a new primary

  • The promote command will be executed on the new primary. By default, it's /usr/bin/pg_ctlcluster 10 main promote. We will change this command so that it generates a new configuration file with the new primary's info, performs the promote, and sends the file to pgbouncer.

  • Once the promote is finished, the other secondaries will execute a follow


#### Update repmgr configuration

The first thing to do is to write a small script that will:

  • Do the promote

  • Generate the new configuration file for pgbouncer

  • Send it to pgbouncer

  • Create the file /usr/local/bin/promote.sh with this content:

    #!/usr/bin/env bash
    set -e
    set -u
    
    PGBOUNCER_DATABASE_INI_NEW="/tmp/pgbouncer.database.ini"
    PGBOUNCER_HOSTS="pgbouncer.raveland.priv"
    DATABASES="pea"
    
    # Pause pgbouncer
    for h in ${PGBOUNCER_HOSTS}
    do
      for d in ${DATABASES}
      do
          psql -U postgres -h ${h} -p 5432 pgbouncer -tc "pause ${d}"
      done
    done
    
    # Promote server
    sudo /usr/bin/pg_ctlcluster 10 main promote
    
    # Generate new config file for pgbouncer
    echo -e "[databases]\n" > ${PGBOUNCER_DATABASE_INI_NEW}
    for d in ${DATABASES}
    do
      echo -e "${d}= host=$(hostname -f)\n" >> ${PGBOUNCER_DATABASE_INI_NEW}
    done
    
    # Copy new config file, reload and resume pgbouncer
    for h in ${PGBOUNCER_HOSTS}
    do
      for d in ${DATABASES}
      do
          rsync -a ${PGBOUNCER_DATABASE_INI_NEW} ${h}:/etc/pgbouncer/pgbouncer.database.ini
          psql -U postgres -h ${h} -p 5432 pgbouncer -tc "reload"
          psql -U postgres -h ${h} -p 5432 pgbouncer -tc "resume ${d}"
      done
    done
    
    rm ${PGBOUNCER_DATABASE_INI_NEW}
    

    Please note that you have to fill in the PGBOUNCER_HOSTS and DATABASES variables with your own values.

  • Be sure that the postgres user can execute this file: chmod 755 /usr/local/bin/promote.sh

  • Update /etc/repmgr.conf on all PostgreSQL machines and change this line

    service_promote_command = '/usr/local/bin/promote.sh'
    

#### Let's try

At this point, you should have the script /usr/local/bin/promote.sh deployed on all your PostgreSQL machines and the file /etc/repmgr.conf updated with the latest change.
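
Before breaking anything, it's also worth confirming that each PostgreSQL node can reach the pgbouncer admin console, since promote.sh relies on it (a quick sketch, to be run as postgres on each PostgreSQL machine):

    $ psql -U postgres -h pgbouncer.raveland.priv -p 5432 pgbouncer -tc "SHOW VERSION;"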

  • Shutdown the primary: systemctl stop postgresql

  • Wait for the new primary

  • Watch the pgbouncer logs:

    $ tail -f /var/log/postgresql/pgbouncer.log
    2018-12-21 23:46:51.141 6182 LOG C-0x563b36d38e50: pgbouncer/postgres@192.168.1.51:56442 login attempt: db=pgbouncer user=postgres tls=no
    2018-12-21 23:46:51.141 6182 LOG PAUSE 'pea' command issued
    2018-12-21 23:46:51.142 6182 LOG C-0x563b36d38e50: pgbouncer/postgres@192.168.1.51:56442 closing because: client close request (age=0)
    2018-12-21 23:46:51.566 6182 LOG C-0x563b36d38e50: pgbouncer/postgres@192.168.1.51:56446 login attempt: db=pgbouncer user=postgres tls=no
    2018-12-21 23:46:51.566 6182 LOG RELOAD command issued
    2018-12-21 23:46:51.566 6182 LOG C-0x563b36d38e50: pgbouncer/postgres@192.168.1.51:56446 closing because: client close request (age=0)
    2018-12-21 23:46:51.605 6182 LOG C-0x563b36d38e50: pgbouncer/postgres@192.168.1.51:56448 login attempt: db=pgbouncer user=postgres tls=no
    2018-12-21 23:46:51.605 6182 LOG RESUME 'pea' command issued
    

Victory!!!
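
You can also check the configuration file that promote.sh generated and shipped to the pgbouncer machine; in this run, node1 (192.168.1.51, as seen in the logs above) was the promoted node:

    # cat /etc/pgbouncer/pgbouncer.database.ini
    [databases]

    pea= host=node1.raveland.priv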


## Conclusion

The purpose of this (long) post was to show you how to setup a PostgreSQL cluster with repmgr and pgbouncer.

Of course, I can't detail all the options of repmgr, so you will have to do your homework :)

But now, at least, you have the basics.

Enjoy 😉
