Table of Contents
Patroni:
Patroni is an open-source tool used to manage PostgreSQL HA clusters. It’s a lightweight solution that uses the PostgreSQL replication mechanism to ensure that all nodes in the cluster have an up-to-date copy of the database. Patroni provides automatic failover and ensures that only one node is active at any given time.
HAProxy:
HAProxy is a load balancer that can be used to distribute incoming traffic across multiple PostgreSQL nodes in a cluster. It acts as a proxy between clients and the active node in the cluster and ensures that requests are routed to the appropriate node. HAProxy can also detect when a node becomes unavailable and automatically switch to a different node in the cluster.
Keepalived:
Keepalived is a Linux-based software that provides high availability for Linux systems. It can be used to monitor the health of the PostgreSQL nodes and automatically switch the virtual IP address to a healthy node in case of a failure. This ensures that client connections are redirected to the active node in the cluster.
etcd:
etcd is a distributed key-value store used to store configuration data for the PostgreSQL HA cluster. It’s used by Patroni to manage the state of the cluster, and by HAProxy and Keepalived to determine which node is currently active. etcd ensures that the configuration data is consistent across all nodes in the cluster.
Watchdog:
The watchdog process is responsible for monitoring the health of the PostgreSQL node and its associated processes. The watchdog process ensures that the PostgreSQL node is running correctly and that it’s able to serve client requests. If the watchdog detects an issue with the node or its processes, it can take action to recover the node or initiate failover to a standby node.
System Requirements
To follow this tutorial along, you will need a minimum of 6 virtual machines installed with CentOS 8. The purpose is defined is below
Hostname | IPAddress | Purpose | Optional |
p1 | 192.168.110.131 | etcd, HAProxy, keepalived | PostgreSQL |
p2 | 192.168.110.132 | etcd, HAProxy, keepalived | PostgreSQL |
p3 | 192.168.110.133 | etcd, HAProxy, keepalived | PostgreSQL |
p4 | 192.168.110.134 | PostgreSQL+Patroni | |
p5 | 192.168.110.135 | PostgreSQL+Patroni | |
p5 | 192.168.110.136 | PostgreSQL+Patroni | |
ha-vip | 192.168.110.140 | HA-VIP |
In this lab set up,
- I will use the words p1, p2 etc., to refer machine 1, 2 and so on. For example., execute “the command in p1″ means execute the command in “192.168.110.131” as mentioned in table above.
- If I mention [all machines] indicate that commands have to be executed in all machines.
- If I mention [… accordingly] it means you need to make necessary changes as mentioned in color red in each of the machines.
Create postgres user [all machines]
useradd postgres passwd postgres
Set the hostname [all machines accordingly]
sudo hostnamectl set-hostname p1 reboot
You should see something like this
[root@p1 ~]#
For machine 2, it should be
sudo hostnamectl set-hostname p2
reboot
and repeat for p3,p4,p5 and p6.,
Get IP Addresses of all machines and edit /etc/hosts file: [all machines]
vi /etc/hosts
add the below lines
192.168.110.131 p1 192.168.110.132 p2 192.168.110.133 p3 192.168.110.134 p4 192.168.110.135 p5 192.168.110.136 p6 192.168.110.140 ha-vip
Here 192.168.110.140 can be any IP which is not being used and is in the network, I choose 192.168.110.140 randomly. we use his IP address in keepalived configuration below.
Disable Selinux [all machines]
Log in to p1 and edit /etc/selinux/config file with any of your favorite text editor:
vi /etc/selinux/config
Change SELINUX=enforcing to SELINUX=disabled
SELINUX=disabled
Reboot to make the selinux changes effect:
reboot
Configure Firewalld
To set up Patroni for PostgreSQL with additional components like HAProxy, etcd, Keepalived, Pgbouncer, and web servers the following ports need to be opened:
- Patroni:
- 8008: This port is used for the HTTP API of Patroni.
- HAProxy:
- 5000: This port is used for HTTP connections to the backend (PostgreSQL service).
- 5001: This port is used for HTTPS connections to the backend (PostgreSQL service).
- etcd:
- 2379: This port is used for etcd client communication.
- 2380: This port is used for etcd server-to-server communication.
- Keepalived:
- 112: This port is used for the Virtual Router Redundancy Protocol (VRRP) communication between Keepalived instances.
- 5405: This port is used for the multicast traffic between Keepalived instances.
- Pgbouncer:
- 6432: This port is used for Pgbouncer client connections.
- PostgreSQL:
- 5432: This port is used for PostgreSQL client connections.
- Web server:
- 5000: This port is used for HTTP connections to the web server.
- 5001: This port is used for HTTPS connections to the web server.
- 7000: This port is used for the reverse proxy connection from the web server to HAProxy.
These ports are required to allow communication between the different components of the setup. However, the specific port configuration can vary depending on your specific setup and requirements. It’s important to adjust firewall rules and network configurations accordingly to ensure the proper functioning and security of the setup.
Run the below commands to open above ports [all machines]
firewall-cmd --zone=public --add-port=5432/tcp --permanent firewall-cmd --zone=public --add-port=6432/tcp --permanent firewall-cmd --zone=public --add-port=8008/tcp --permanent firewall-cmd --zone=public --add-port=2379/tcp --permanent firewall-cmd --zone=public --add-port=2380/tcp --permanent firewall-cmd --permanent --zone=public --add-service=http firewall-cmd --zone=public --add-port=5000/tcp --permanent firewall-cmd --zone=public --add-port=5001/tcp --permanent firewall-cmd --zone=public --add-port=7000/tcp --permanent firewall-cmd --zone=public --add-port=112/tcp --permanent firewall-cmd --zone=public --add-port=5405/tcp --permanent firewall-cmd --add-rich-rule='rule protocol value="vrrp" accept' --permanent firewall-cmd --reload
Set up etcd for PostgreSQL 15 HA with Patroni
Install etcd [p1,p2,p3]
Create etcd.repo file like below:
vi /etc/yum.repos.d/etcd.repo
Add following:
[etcd] name=PostgreSQL common RPMs for RHEL / Rocky $releasever - $basearch baseurl=http://ftp.postgresql.org/pub/repos/yum/common/pgdg-rhel8-extras/redhat/rhel-$releasever-$basearch enabled=1 gpgcheck=1 gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-PGDG repo_gpgcheck = 1
Save and close the editor when you are finished.
Type following command to install etcd:
dnf -y install etcd
Configure etcd [p1,p2,p3 accordingly]
Edit the configuration file in p1
mv /etc/etcd/etcd.conf /etc/etcd/etcd.conf.orig vi /etc/etcd/etcd.conf
Add following configuration(Give IP address of current machine in all places, ETCD_INITIAL_CLUSTER should get IP addresses of p1,p2 and p3):
Add the following configuration and save.
# This is an example configuration file for an etcd cluster with 3 nodes
#specify the name of an etcd member
ETCD_NAME=p1
# Configure the data directory
ETCD_DATA_DIR="/var/lib/etcd/p1"
# Configure the listen URLs
ETCD_LISTEN_PEER_URLS="http://192.168.110.131:2380"
ETCD_LISTEN_CLIENT_URLS="http://192.168.110.131:2379"
# Configure the advertise URLs
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.110.131:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://192.168.110.131:2379"
# Configure the initial cluster
ETCD_INITIAL_CLUSTER="p1=http://192.168.110.131:2380,p2=http://192.168.110.132:2380,p3=http://192.168.110.133:2380"
# Configure the initial cluster state
ETCD_INITIAL_CLUSTER_STATE="new"
# Configure the initial cluster token
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
# Configure v2 API
ETCD_ENABLE_V2="true"
Edit the configuration file in p2
mv /etc/etcd/etcd.conf /etc/etcd/etcd.conf.orig
vi /etc/etcd/etcd.conf
Add the following configuration and save.
# This is an example configuration file for an etcd cluster with 3 nodes
#specify the name of an etcd member
ETCD_NAME=p2
# Configure the data directory
ETCD_DATA_DIR="/var/lib/etcd/p2"
# Configure the listen URLs
ETCD_LISTEN_PEER_URLS="http://192.168.110.132:2380"
ETCD_LISTEN_CLIENT_URLS="http://192.168.110.132:2379"
# Configure the advertise URLs
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.110.132:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://192.168.110.132:2379"
# Configure the initial cluster
ETCD_INITIAL_CLUSTER="p1=http://192.168.110.131:2380,p2=http://192.168.110.132:2380,p3=http://192.168.110.133:2380"
# Configure the initial cluster state
ETCD_INITIAL_CLUSTER_STATE="new"
# Configure the initial cluster token
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
# Configure v2 API
ETCD_ENABLE_V2="true"
Edit the configuration file in p3
mv /etc/etcd/etcd.conf /etc/etcd/etcd.conf.orig
vi /etc/etcd/etcd.conf
Add the following configuration and save.
# This is an example configuration file for an etcd cluster with 3 nodes
#specify the name of an etcd member
ETCD_NAME=p3
# Configure the data directory
ETCD_DATA_DIR="/var/lib/etcd/p3"
# Configure the listen URLs
ETCD_LISTEN_PEER_URLS="http://192.168.110.133:2380"
ETCD_LISTEN_CLIENT_URLS="http://192.168.110.133:2379"
# Configure the advertise URLs
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.110.133:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://192.168.110.133:2379"
# Configure the initial cluster
ETCD_INITIAL_CLUSTER="p1=http://192.168.110.131:2380,p2=http://192.168.110.132:2380,p3=http://192.168.110.133:2380"
# Configure the initial cluster state
ETCD_INITIAL_CLUSTER_STATE="new"
# Configure the initial cluster token
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
# Configure v2 API
ETCD_ENABLE_V2="true"
Now start the etcd on all the three machines [p1,p2,p3]
systemctl start etcd systemctl enable etcd systemctl status etcd
add below lines in the bash profile [p1,p2,p3]
p1=192.168.110.131 p2=192.168.110.132 p3=192.168.110.133 ENDPOINTS=$p1:2379,$p2:2379,$p3:2379
Check if etcd has been successfully working with
. .bash_profile etcdctl endpoint status --write-out=table --endpoints=$ENDPOINTS
The sample output should look like this
[root@p1 ~]# etcdctl endpoint status --write-out=table --endpoints=$ENDPOINTS +----------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ | ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS | +----------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ | 192.168.110.131:2379 | 341a3c460c1c993a | 3.5.7 | 20 kB | true | false | 2 | 9 | 9 | | | 192.168.110.132:2379 | 4679fe0fcb37326d | 3.5.7 | 20 kB | false | false | 2 | 9 | 9 | | | 192.168.110.133:2379 | ab23bcc86cf3190b | 3.5.7 | 20 kB | false | false | 2 | 9 | 9 | | +----------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+ [root@p1 ~]#
Set up keepalived for PostgreSQL 15 HA with Patroni
Install Keepalived [p1,p2,p3]
dnf -y install keepalived
Configure Keepalived [p1,p2,p3]
Edit the /etc/sysctl.conf file to allow the server to bind to the virtual IP address.
vi /etc/sysctl.conf
Add the net.ipv4_ip_nonlocal_bind=1 directive, which allows the server to accept connections for IP addresses that are not bound to any of its interfaces, enabling the use of a floating, virtual IP:
net.ipv4.ip_nonlocal_bind = 1 net.ipv4.ip_forward = 1
Save and close the editor when you are finished.
Type the following command to reload settings from config file without rebooting:
sudo sysctl --system sudo sysctl -p
Create /etc/keepalived/keepalived.conf and make necessary changes [p1,p2,p3]
Get the interface name of your network with ifconfig
[root@p1 ~]# ifconfig
ens33: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.110.131 netmask 255.255.255.0 broadcast 192.168.110.255
Now edit the /etc/keepalived/keepalived.conf
vi /etc/keepalived/keepalived.conf
Add following configuration:
vrrp_script check_haproxy {
script "pkill -0 haproxy"
interval 2
weight 2
}
vrrp_instance VI_1 {
state MASTER
interface ens33
virtual_router_id 51
priority 101
advert_int 1
virtual_ipaddress {
192.168.110.140
}
track_script {
check_haproxy
}
}
Repeat the same in p2
vi /etc/keepalived/keepalived.conf
Add following configuration:
p2 file should look like below
vrrp_script check_haproxy {
script "pkill -0 haproxy"
interval 2
weight 2
}
vrrp_instance VI_1 {
state BACKUP
interface ens33
virtual_router_id 51
priority 100
advert_int 1
virtual_ipaddress {
192.168.110.140
}
track_script {
check_haproxy
}
}
Repeat the same in p3
vi /etc/keepalived/keepalived.conf
Add following configuration:
p3 file should look like below
vrrp_script check_haproxy {
script "pkill -0 haproxy"
interval 2
weight 2
}
vrrp_instance VI_1 {
state BACKUP
interface ens33
virtual_router_id 51
priority 100
advert_int 1
virtual_ipaddress {
192.168.110.140
}
track_script {
check_haproxy
}
}
Start the keepalived service on all the machines.
systemctl start keepalived systemctl enable keepalived systemctl status keepalived
Check on your (p1) node to see if your (ens33) network interface has config
ured with an additional shared IP (192.168.110.140) with the command below
ip addr show ens33
the output should look like this
[root@p1 ~]# ip addr show ens33 2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000 link/ether 00:0c:29:5c:75:6a brd ff:ff:ff:ff:ff:ff altname enp2s1 inet 192.168.110.131/24 brd 192.168.110.255 scope global dynamic noprefixroute ens33 valid_lft 1107sec preferred_lft 1107sec inet 192.168.110.140/32 scope global ens33 valid_lft forever preferred_lft forever inet6 fe80::20c:29ff:fe5c:756a/64 scope link noprefixroute valid_lft forever preferred_lft forever [root@p1 ~]#
and you cannot see 192.168.110.140 in p2 and p3, because it’s a floating IP and will be moved to p2 or p3 when p1 is unavailable.
[root@p2 ~]# ip addr show ens33 2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000 link/ether 00:0c:29:fc:55:9d brd ff:ff:ff:ff:ff:ff altname enp2s1 inet 192.168.110.132/24 brd 192.168.110.255 scope global dynamic noprefixroute ens33 valid_lft 1110sec preferred_lft 1110sec inet6 fe80::20c:29ff:fefc:559d/64 scope link noprefixroute valid_lft forever preferred_lft forever [root@p2 ~]#
[root@p3 ~]# ip addr show ens33 2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000 link/ether 00:0c:29:7d:46:36 brd ff:ff:ff:ff:ff:ff altname enp2s1 inet 192.168.110.133/24 brd 192.168.110.255 scope global dynamic noprefixroute ens33 valid_lft 1120sec preferred_lft 1120sec inet6 fe80::20c:29ff:fe7d:4636/64 scope link noprefixroute valid_lft forever preferred_lft forever [root@p3 ~]#
Set up HAProxy for PostgreSQL 15 HA with Patroni
Install HAProxy [p1,p2,p3]
dnf -y install haproxy
Configure HAProxy [p1,p2,p3]
Edit haproxy.cfg file on your first node and repeat the same in p2 and p3 and make necessary changes in configuration files
mv /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg.orig vi /etc/haproxy/haproxy.cfg
Add following configuration:
global maxconn 1000 defaults mode tcp log global option tcplog retries 3 timeout queue 1m timeout connect 4s timeout client 60m timeout server 60m timeout check 5s maxconn 900 listen stats mode http bind *:7000 stats enable stats uri / listen primary bind 192.168.110.140:5000 option httpchk OPTIONS /master http-check expect status 200 default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions server p4 192.168.110.134:5432 maxconn 100 check port 8008 server p5 192.168.110.135:5432 maxconn 100 check port 8008 server p6 192.168.110.136:5432 maxconn 100 check port 8008 listen standby bind 192.168.110.140:5001 balance roundrobin option httpchk OPTIONS /replica http-check expect status 200 default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions server p4 192.168.110.134:5432 maxconn 100 check port 8008 server p5 192.168.110.135:5432 maxconn 100 check port 8008 server p6 192.168.110.136:5432 maxconn 100 check port 8008
copy the file to p2 and p3
[root@p1 ~]# scp /etc/haproxy/haproxy.cfg p2:/etc/haproxy/haproxy.cfg root@p2's password: haproxy.cfg 100% 1218 676.0KB/s 00:00 [root@p1 ~]# scp /etc/haproxy/haproxy.cfg p3:/etc/haproxy/haproxy.cfg root@p3's password: haproxy.cfg 100% 1218 491.5KB/s 00:00 [root@p1 ~]#
start haproxy service [p1, p2 and p3]
systemctl start haproxy systemctl enable haproxy systemctl status haproxy
Install PostgreSQL 15 with Patroni for PostgreSQL 15 HA with Patroni
Install PostgreSQL [p4, p5 and p6 ; optionally on p1]
Since I am going to have PostgreSQL client in p1 to test the configuration, I will install PostgreSQL in all the machines.
dnf install epel-release -y dnf --enablerepo=powertools install moreutils -y dnf install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-8-x86_64/pgdg-redhat-repo-latest.noarch.rpm dnf -qy module disable postgresql dnf install -y postgresql15-server dnf install -y postgresql15-contrib dnf install -y postgresql15-devel
Install Patroni [p4,p5,p6]
yum install -y python3 python3-devel gcc sudo ln -s /usr/local/bin/pip /bin/pip pip3 install --upgrade pip pip install psycopg2-binary pip install python-etcd python3 -m pip install patroni[etcd] dnf -y install patroni dnf -y install patroni-etcd dnf -y install watchdog sudo ln -s /usr/local/bin/patronictl /bin/patronictl /usr/local/bin/patronictl –help
Configure Patroni [p4,p5,p6 accordingly]
Create a configuration file for Patroni on your first node (p4) in our case, like below:
mkdir -p /etc/patroni
vi /etc/patroni/patroni.yml
Edit the file with following content (p4)
scope: postgres namespace: /db/ name: p4 restapi: listen: 192.168.110.134:8008 connect_address: 192.168.110.134:8008 etcd3: hosts: 192.168.110.131:2379,192.168.110.132:2379,192.168.110.133:2379 bootstrap: dcs: ttl: 30 loop_wait: 10 retry_timeout: 10 maximum_lag_on_failover: 1048576 postgresql: use_pg_rewind: true pg_hba: - host replication replicator 127.0.0.1/32 md5 - host replication replicator 192.168.110.134/0 md5 - host replication replicator 192.168.110.135/0 md5 - host replication replicator 192.168.110.136/0 md5 - host all all 0.0.0.0/0 md5 postgresql: listen: 192.168.110.134:5432 connect_address: 192.168.110.134:5432 data_dir: /u01/pgsql/15 bin_dir: /usr/pgsql-15/bin authentication: replication: username: replicator password: replicator superuser: username: postgres password: postgres parameters: unix_socket_directories: '.' watchdog: mode: required device: /dev/watchdog safety_margin: 5 tags: nofailover: false noloadbalance: false clonefrom: false nosync: false
In p5, edit the file with following content
scope: postgres namespace: /db/ name: p5 restapi: listen: 192.168.110.135:8008 connect_address: 192.168.110.135:8008 etcd3: hosts: 192.168.110.131:2379,192.168.110.132:2379,192.168.110.133:2379 bootstrap: dcs: ttl: 30 loop_wait: 10 retry_timeout: 10 maximum_lag_on_failover: 1048576 postgresql: use_pg_rewind: true pg_hba: - host replication replicator 127.0.0.1/32 md5 - host replication replicator 192.168.110.134/0 md5 - host replication replicator 192.168.110.135/0 md5 - host replication replicator 192.168.110.136/0 md5 - host all all 0.0.0.0/0 md5 postgresql: listen: 192.168.110.135:5432 connect_address: 192.168.110.135:5432 data_dir: /u01/pgsql/15 bin_dir: /usr/pgsql-15/bin authentication: replication: username: replicator password: replicator superuser: username: postgres password: postgres parameters: unix_socket_directories: '.' watchdog: mode: required device: /dev/watchdog safety_margin: 5 tags: nofailover: false noloadbalance: false clonefrom: false nosync: false
In p6, edit the file with following content
scope: postgres namespace: /db/ name: p6 restapi: listen: 192.168.110.136:8008 connect_address: 192.168.110.136:8008 etcd3: hosts: 192.168.110.131:2379,192.168.110.132:2379,192.168.110.133:2379 bootstrap: dcs: ttl: 30 loop_wait: 10 retry_timeout: 10 maximum_lag_on_failover: 1048576 postgresql: use_pg_rewind: true pg_hba: - host replication replicator 127.0.0.1/32 md5 - host replication replicator 192.168.110.134/0 md5 - host replication replicator 192.168.110.135/0 md5 - host replication replicator 192.168.110.136/0 md5 - host all all 0.0.0.0/0 md5 postgresql: listen: 192.168.110.136:5432 connect_address: 192.168.110.136:5432 data_dir: /u01/pgsql/15 bin_dir: /usr/pgsql-15/bin authentication: replication: username: replicator password: replicator superuser: username: postgres password: postgres parameters: unix_socket_directories: '.' watchdog: mode: required device: /dev/watchdog safety_margin: 5 tags: nofailover: false noloadbalance: false clonefrom: false nosync: false
Enable Watchdog [p4,p5,p6]
Edit /etc/watchdog.conf to enable software watchdog:
vi /etc/watchdog.conf
Uncomment following line:
watchdog-device = /dev/watchdog
Execute following commands to activate watchdog:
mknod /dev/watchdog c 10 130 modprobe softdog chown postgres /dev/watchdog
For example.,
[root@p3 ~]# mknod /dev/watchdog c 10 130 [root@p3 ~]# modprobe softdog [root@p3 ~]# chown postgres /dev/watchdog [root@p3 ~]#
Add the necessary permissions for postgres user to create data directory [p4,p5,p6]
mkdir -p /u01 chown -R postgres:postgres /u01
start patroni on p4
systemctl start patroni
and the log should look like this
May 7 09:52:48 p4 systemd[1]: Started Runners to orchestrate a high-availability PostgreSQL. May 7 09:52:48 p4 patroni[8469]: 2023-05-07 09:52:48,925 INFO: Selected new etcd server http://192.168.110.133:2379 May 7 09:52:48 p4 patroni[8469]: 2023-05-07 09:52:48,944 INFO: No PostgreSQL configuration items changed, nothing to reload. May 7 09:52:49 p4 patroni[8469]: 2023-05-07 09:52:49,046 INFO: Lock owner: None; I am p4 May 7 09:52:49 p4 patroni[8469]: 2023-05-07 09:52:49,146 INFO: trying to bootstrap a new cluster May 7 09:52:49 p4 patroni[8482]: The files belonging to this database system will be owned by user "postgres". May 7 09:52:49 p4 patroni[8482]: This user must also own the server process. May 7 09:52:49 p4 patroni[8482]: The database cluster will be initialized with locale "en_US.UTF-8". May 7 09:52:49 p4 patroni[8482]: The default text search configuration will be set to "english". May 7 09:52:49 p4 patroni[8482]: Data page checksums are enabled. May 7 09:52:49 p4 patroni[8482]: creating directory /u01/pgsql/15 ... ok May 7 09:52:49 p4 patroni[8482]: creating subdirectories ... ok May 7 09:52:49 p4 patroni[8482]: selecting dynamic shared memory implementation ... posix May 7 09:52:49 p4 patroni[8482]: selecting default max_connections ... 100 May 7 09:52:49 p4 patroni[8482]: selecting default shared_buffers ... 128MB May 7 09:52:49 p4 patroni[8482]: selecting default time zone ... Asia/Kolkata May 7 09:52:49 p4 patroni[8482]: creating configuration files ... ok May 7 09:52:49 p4 patroni[8482]: running bootstrap script ... ok May 7 09:52:49 p4 patroni[8482]: performing post-bootstrap initialization ... ok May 7 09:52:49 p4 patroni[8482]: syncing data to disk ... ok May 7 09:52:49 p4 patroni[8482]: initdb: warning: enabling "trust" authentication for local connections May 7 09:52:49 p4 patroni[8482]: initdb: hint: You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb. May 7 09:52:49 p4 patroni[8482]: Success. You can now start the database server using: May 7 09:52:49 p4 patroni[8482]: /usr/pgsql-15/bin/pg_ctl -D /u01/pgsql/15 -l logfile start May 7 09:52:50 p4 patroni[8496]: 2023-05-07 09:52:50.238 IST [8496] LOG: redirecting log output to logging collector process May 7 09:52:50 p4 patroni[8496]: 2023-05-07 09:52:50.238 IST [8496] HINT: Future log output will appear in directory "log". May 7 09:52:50 p4 patroni[8469]: 2023-05-07 09:52:50,253 INFO: postmaster pid=8496 May 7 09:52:50 p4 patroni[8504]: 192.168.110.134:5432 - accepting connections May 7 09:52:50 p4 patroni[8506]: 192.168.110.134:5432 - accepting connections May 7 09:52:50 p4 patroni[8469]: 2023-05-07 09:52:50,344 INFO: establishing a new patroni connection to the postgres cluster May 7 09:52:50 p4 patroni[8469]: 2023-05-07 09:52:50,405 INFO: running post_bootstrap May 7 09:52:50 p4 patroni[8469]: 2023-05-07 09:52:50,440 INFO: Software Watchdog activated with 25 second timeout, timing slack 15 seconds May 7 09:52:50 p4 patroni[8469]: 2023-05-07 09:52:50,639 INFO: initialized a new cluster May 7 09:52:50 p4 patroni[8469]: 2023-05-07 09:52:50,737 INFO: no action. I am (p4), the leader with the lock
start patroni in p5 and p6
systemctl start patroni
on p5, you should see log like this
May 7 09:53:30 p5 systemd[1]: Started Runners to orchestrate a high-availability PostgreSQL.
May 7 09:53:30 p5 patroni[8964]: 2023-05-07 09:53:30,586 INFO: Selected new etcd server http://192.168.110.131:2379
May 7 09:53:30 p5 patroni[8964]: 2023-05-07 09:53:30,663 INFO: No PostgreSQL configuration items changed, nothing to reload.
May 7 09:53:30 p5 patroni[8964]: 2023-05-07 09:53:30,817 INFO: Lock owner: p4; I am p5
May 7 09:53:30 p5 patroni[8964]: 2023-05-07 09:53:30,868 INFO: trying to bootstrap from leader 'p4'
May 7 09:53:31 p5 patroni[8974]: WARNING: skipping special file "./.s.PGSQL.5432"
May 7 09:53:31 p5 patroni[8974]: WARNING: skipping special file "./.s.PGSQL.5432"
May 7 09:53:31 p5 patroni[8964]: 2023-05-07 09:53:31,628 INFO: replica has been created using basebackup
May 7 09:53:31 p5 patroni[8964]: 2023-05-07 09:53:31,630 INFO: bootstrapped from leader 'p4'
May 7 09:53:31 p5 patroni[8964]: 2023-05-07 09:53:31,951 INFO: postmaster pid=8982
May 7 09:53:31 p5 patroni[8982]: 2023-05-07 09:53:31.960 IST [8982] LOG: redirecting log output to logging collector process
May 7 09:53:31 p5 patroni[8982]: 2023-05-07 09:53:31.960 IST [8982] HINT: Future log output will appear in directory "log".
May 7 09:53:31 p5 patroni[8983]: 192.168.110.135:5432 - rejecting connections
May 7 09:53:31 p5 patroni[8989]: 192.168.110.135:5432 - rejecting connections
May 7 09:53:33 p5 patroni[8997]: 192.168.110.135:5432 - accepting connections
May 7 09:53:33 p5 patroni[8964]: 2023-05-07 09:53:33,070 INFO: Lock owner: p4; I am p5
May 7 09:53:33 p5 patroni[8964]: 2023-05-07 09:53:33,071 INFO: establishing a new patroni connection to the postgres cluster
May 7 09:53:33 p5 patroni[8964]: 2023-05-07 09:53:33,124 INFO: no action. I am (p5), a secondary, and following a leader (p4)
May 7 09:53:40 p5 patroni[8964]: 2023-05-07 09:53:40,822 INFO: no action. I am (p5), a secondary, and following a leader (p4)
Repeat the same in p6
Check patroni member nodes:
patronictl -c /etc/patroni/patroni.yml list
The output should look like this
[root@p4 pgsql]# patronictl -c /etc/patroni/patroni.yml list + Cluster: postgres -------+---------+---------+----+-----------+ | Member | Host | Role | State | TL | Lag in MB | +--------+-----------------+---------+---------+----+-----------+ | p4 | 192.168.110.134 | Leader | running | 1 | | | p5 | 192.168.110.135 | Replica | running | 1 | 0 | | p6 | 192.168.110.136 | Replica | running | 1 | 0 | +--------+-----------------+---------+---------+----+-----------+ [root@p4 pgsql]#
Check if you can connect from psql client
psql -h 192.168.110.140 -p 5000
and you should get something like this
[postgres@p1 ~]$ psql -h 192.168.110.140 -p 5000 psql (15.2) Type "help" for help. postgres=#
Create a sample table and we will use this table for failure scenarios
create table emp(id int, sal int);
Test switchover
[root@p4 pgsql]# patronictl -c /etc/patroni/patroni.yml switchover Current cluster topology + Cluster: postgres -------+---------+---------+----+-----------+ | Member | Host | Role | State | TL | Lag in MB | +--------+-----------------+---------+---------+----+-----------+ | p4 | 192.168.110.134 | Leader | running | 1 | | | p5 | 192.168.110.135 | Replica | running | 1 | 0 | | p6 | 192.168.110.136 | Replica | running | 1 | 0 | +--------+-----------------+---------+---------+----+-----------+ Primary [p4]: Candidate ['p5', 'p6'] []: p5 When should the switchover take place (e.g. 2023-05-07T11:01 ) [now]: Are you sure you want to switchover cluster postgres, demoting current leader p4? [y/N]: y 2023-05-07 10:02:03.39983 Successfully switched over to "p5" + Cluster: postgres -------+---------+---------+----+-----------+ | Member | Host | Role | State | TL | Lag in MB | +--------+-----------------+---------+---------+----+-----------+ | p4 | 192.168.110.134 | Replica | stopped | | unknown | | p5 | 192.168.110.135 | Leader | running | 1 | | | p6 | 192.168.110.136 | Replica | running | 1 | 0 | +--------+-----------------+---------+---------+----+-----------+ .. (after few seconds) .. [root@p4 pgsql]# patronictl -c /etc/patroni/patroni.yml list + Cluster: postgres -------+---------+---------+----+-----------+ | Member | Host | Role | State | TL | Lag in MB | +--------+-----------------+---------+---------+----+-----------+ | p4 | 192.168.110.134 | Replica | running | 2 | 0 | | p5 | 192.168.110.135 | Leader | running | 2 | | | p6 | 192.168.110.136 | Replica | running | 2 | 0 | +--------+-----------------+---------+---------+----+-----------+ [root@p4 pgsql]#
Created a script that continuously inserts data into table by creating a connection.
python3 HATest.py --port 5000
The log shows that my python script connected to 192.168.110.135 to execute insert statement and it experienced a connection lost when stopped the primary with systemctl stop patroni for few seconds and connected to new primary 192.168.110.136
We will cover the failure scenarios and more about HATest.py in detail in another post.
Connected to: 192.168.110.135 PostgreSQL server is not in recovery mode. Performing INSERT query... Data inserted successfully Count: 25 Press Ctrl+Z to terminating the job... Connected to: 192.168.110.135 PostgreSQL server is not in recovery mode. Performing INSERT query... Data inserted successfully Count: 26 Press Ctrl+Z to terminating the job... Connection attempt 1 failed. Retrying in 1 seconds... Connection attempt 2 failed. Retrying in 1 seconds... Connected to: 192.168.110.136 PostgreSQL server is not in recovery mode. Performing INSERT query... Data inserted successfully Count: 27 Press Ctrl+Z to terminating the job...
A simple ping test can also serve the purpose
Write test
while true;
do
echo "select inet_server_addr(),now()::timestamp" | psql -Upostgres -h192.168.110.140 -p5000 -t;
sleep 1;
done
Read test
while true;
do
echo "select inet_server_addr(),now()::timestamp" | psql -Upostgres -h192.168.110.140 -p5000 -t;
sleep 1;
done
Words from postgreshelp
Thank you for giving your valuable time to read the above information. I hope the content served your purpose in reaching out the blog.
Suggestions for improvement of the blog are highly appreciable. Please contact us for any information/suggestion/feedback.
If you want to be updated with all our articles
please follow us on Facebook | Twitter
Please subscribe to our newsletter.