HA PEM using the C1 architecture with EFM and a Virtual IP v10

This page provides detailed instructions to install and configure a High Availability (HA) PEM deployment according to reference architecture C1.

This example uses EDB Failover Manager (EFM 5.0) for cluster management, EDB Postgres Advanced Server (EPAS 17) as the PEM backend database, and a Virtual IP (VIP) as the mechanism for routing traffic to the primary on RHEL-like systems. See High Availability Patterns for PEM Deployment for an overview of the other options.

Witness nodes vs standby nodes

In this example, we configure a primary, two standbys, and a witness node. A witness node is only needed when you have exactly two data nodes. If you are configuring three or more data nodes (for example, a primary and two standbys), you can omit the steps pertaining to the witness node. If you are configuring only two data nodes, include them.

The examples that follow use these IP addresses:

  • 172.16.161.200 - PEM Primary
  • 172.16.161.201 - PEM Standby 1
  • 172.16.161.202 - PEM Standby 2
  • 172.16.161.203 - EFM Witness Node
  • 172.16.161.245 - PEM VIP (used by agents and users to connect)

The following must use the VIP address:

  • The PEM agent binding of the monitored database servers
  • Accessing the PEM web client
  • Accessing the web server services

Initial package installation and Postgres configuration

Perform the following steps on all nodes unless stated otherwise.

  1. Install the following packages:

    dnf -qy module disable postgresql
    dnf -y install epel-release
    dnf config-manager --set-enabled crb
    dnf -y install edb-as17-server edb-pem edb-as17-server-sslutils edb-efm50
  2. Initialize a Postgres database and start the service.

    PGSETUP_INITDB_OPTIONS="-E UTF-8" /usr/edb/as17/bin/edb-as-17-setup initdb
    systemctl start edb-as-17
    systemctl enable edb-as-17
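
    Optionally, verify that the database server is up and accepting local connections on port 5444. A minimal check, run as the enterprisedb OS user:

    su - enterprisedb -c "/usr/edb/as17/bin/psql -p 5444 -d edb -c 'SELECT version();'"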
  3. Open the following ports on the firewall:

    • 8443 for PEM Server (HTTPS)
    • 5444 for EPAS
    • 7800 for EFM

    For example:

    firewall-cmd --zone=public --add-port=5444/tcp --permanent 
    firewall-cmd --zone=public --add-port=8443/tcp --permanent
    firewall-cmd --zone=public --add-port=7800/tcp --permanent
    firewall-cmd --reload      
  4. On the standbys only, install the NGINX web server and add the nginx user to the pem group. This is not required on the primary because it occurs automatically during PEM configuration.

    dnf install nginx
    usermod -a -G pem nginx
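
    You can confirm that the group change took effect; the output of the following command should list pem among the nginx user's groups:

    id nginx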

User and access configuration on the primary

Perform the following steps on the primary.

  1. Create a superuser that can log in using a password.

    su - enterprisedb -c "psql edb -c \"create role pemserver login superuser password 'your-password-here';\""
  2. Add the following line to the pg_hba.conf file to permit the new user to connect from any of the server IPs. You may adjust the size of the subnet as appropriate to your network, but it must include all the PEM nodes.

    hostssl  all  pemserver  172.16.161.0/24  scram-sha-256
  3. Add the following line to the pg_hba.conf file to permit other PEM users to connect to the PEM backend through the web application. You may adjust the size of the subnet as appropriate to your network, but it must include all the PEM nodes.

    hostssl  all  +pem_user  172.16.161.0/24  scram-sha-256
  4. Restart the Postgres server.

    systemctl restart edb-as-17
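
    Optionally, from one of the standbys, verify that the pemserver role can reach the primary over SSL. This is a quick sanity check of the pg_hba.conf entries above; enter the pemserver password when prompted, and expect a single row in return:

    /usr/edb/as17/bin/psql "host=172.16.161.200 port=5444 dbname=edb user=pemserver sslmode=require" -c "SELECT 1;"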

Configure PEM on the primary

Configure the PEM installation on the primary server only:

  1. Manually assign the VIP to the primary. Use the name of the network interface on your system and the same prefix that you will configure in EFM later in this procedure (this example assumes the interface is ens33). For example:

    /usr/edb/efm-5.0/bin/efm_address add4 ens33 172.16.161.245/24
  2. Run the PEM configuration script, specifying the VIP as the host and option 1 (Database and Web Services):

    /usr/edb/pem/bin/configure-pem-server.sh -t 1 -ho 172.16.161.245

    You will be prompted for various additional details. For configuration options, see Configuring the PEM server on Linux.

  3. Optionally, to synchronize PEM web application user preferences between instances, configure central storage of user preferences. This will be replicated to the standbys in the next step.

Copy PEM configuration to the standbys

During the previous step, various configuration files and directories were created on the primary. In this step, we will create the same files and directories on the standbys.

  1. On the standbys, create the uWSGI directory structure:

    mkdir -p /etc/edb-uwsgi/apps-available
    mkdir -p /etc/edb-uwsgi/apps-enabled
    chmod 755 /etc/edb-uwsgi
    chmod 755 /etc/edb-uwsgi/apps-available
    chmod 755 /etc/edb-uwsgi/apps-enabled
    mkdir -p /var/log/edb-uwsgi
    mkdir -p /var/run/edb-uwsgi
    chmod 755 /var/log/edb-uwsgi
    chmod 755 /var/run/edb-uwsgi
    chown pem:pem /var/run/edb-uwsgi
    chown pem:pem /var/log/edb-uwsgi
  2. Copy the following files from the primary node to the standby nodes at the same location, overwriting any existing files. (A scripted example follows the list.)

    • /usr/edb/pem/resources/server-pem.crt: Self-signed server certificate for the web application. You may replace this with another certificate either now or after deployment is complete.
    • /usr/edb/pem/resources/server-pem.key: Private key of the web server certificate. This must match the certificate.
    • /usr/edb/pem/share/.install-config: Used by PEM during upgrades. Copying this to the standbys means you can initiate a PEM database upgrade from whichever node is the primary at the time.
    • /usr/edb/pem/web/config_setup.py: Configuration file for the web application.
    • /usr/edb/pem/web/config_local.py: Local overrides for web application configuration. If not present on the primary, no action is needed.
    • /etc/edb-uwsgi/uwsgi.ini: uWSGI global configuration file.
    • /usr/lib/tmpfiles.d/edb-uwsgi.conf: systemd tmpfiles configuration for the uWSGI runtime directories.
    • /etc/systemd/system/edb-uwsgi.service: uWSGI service file for the PEM web application.
    • /etc/edb-uwsgi/apps-available/pem.ini: uWSGI configuration file for the PEM web application.
    • /etc/nginx/conf.d/edb-pem.conf: NGINX configuration for the PEM web application.
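
    As an illustration, the files can be copied from the primary with scp. A sketch, assuming root SSH access from the primary to the standbys; standby1 and standby2 are placeholder hostnames for 172.16.161.201 and 172.16.161.202. Add /usr/edb/pem/web/config_local.py to the list if it exists on your primary.

    # Run on the primary; copies the PEM and uWSGI configuration to both standbys
    for host in standby1 standby2; do
        scp /usr/edb/pem/resources/server-pem.crt \
            /usr/edb/pem/resources/server-pem.key root@$host:/usr/edb/pem/resources/
        scp /usr/edb/pem/share/.install-config root@$host:/usr/edb/pem/share/
        scp /usr/edb/pem/web/config_setup.py root@$host:/usr/edb/pem/web/
        scp /etc/edb-uwsgi/uwsgi.ini root@$host:/etc/edb-uwsgi/
        scp /etc/edb-uwsgi/apps-available/pem.ini root@$host:/etc/edb-uwsgi/apps-available/
        scp /usr/lib/tmpfiles.d/edb-uwsgi.conf root@$host:/usr/lib/tmpfiles.d/
        scp /etc/systemd/system/edb-uwsgi.service root@$host:/etc/systemd/system/
        scp /etc/nginx/conf.d/edb-pem.conf root@$host:/etc/nginx/conf.d/
    done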
  3. On the standbys, reload systemd, then enable and start the uWSGI service and restart NGINX so they pick up the new configuration:

    systemctl daemon-reload
    systemctl enable edb-uwsgi
    systemctl start edb-uwsgi
    systemctl restart nginx

Set up the primary node for streaming replication

  1. Create the replication role:

    psql -h 172.16.161.200 -p 5444 -U enterprisedb edb \
        -c "CREATE ROLE repl REPLICATION LOGIN PASSWORD 'password';"

    Use a password of your choice.

  2. Configure the following in the postgresql.conf file:

    wal_level = replica
    max_wal_senders = 10
    wal_keep_size = 500
    max_replication_slots = 10

    For more information on configuring parameters for streaming replication, see the PostgreSQL documentation.
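
    Alternatively, the same settings can be applied with ALTER SYSTEM rather than editing postgresql.conf directly. A sketch; the changes are written to postgresql.auto.conf and take effect after the restart in step 4:

    # Run on the primary as root; '500MB' matches wal_keep_size = 500 above
    sudo -u enterprisedb /usr/edb/as17/bin/psql -p 5444 -d edb \
        -c "ALTER SYSTEM SET wal_level = 'replica'" \
        -c "ALTER SYSTEM SET max_wal_senders = 10" \
        -c "ALTER SYSTEM SET wal_keep_size = '500MB'" \
        -c "ALTER SYSTEM SET max_replication_slots = 10"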

  3. Add the following entry in the host-based authentication file (/var/lib/edb/as17/data/pg_hba.conf) to allow the replication user to connect from all the standbys:

    hostssl  replication  repl  172.16.161.0/24  scram-sha-256
  4. Restart the EPAS server.

    systemctl restart edb-as-17.service

Set up the standby nodes for streaming replication

Use the pg_basebackup utility to create replicas of the PEM backend database server on the standby servers.

  1. Stop the service for EPAS on all the standby nodes:

    systemctl stop edb-as-17.service
  2. Remove the data directory of the database server on all the standby nodes:

    su - enterprisedb
    rm -rf /var/lib/edb/as17/data/*
  3. Create the .pgpass file in the home directory of the enterprisedb user on all the standby nodes and add the following content. Replace <password> with the password of the replication user created previously.

    172.16.161.200:5444:replication:repl:<password>
    172.16.161.201:5444:replication:repl:<password>
    172.16.161.202:5444:replication:repl:<password>
  4. Set the permissions on the file to restrict access:

    chmod 600 ~/.pgpass
  5. Take a backup of the primary node on each of the standby nodes using pg_basebackup:

    su - enterprisedb -c "/usr/edb/as17/bin/pg_basebackup -h 172.16.161.200 \
        -D /var/lib/edb/as17/data -U repl -v -P -Fp -R -p 5444"
  6. In the postgresql.conf file on each of the standby nodes, edit the following parameter:

    hot_standby = on
  7. Start the EPAS database server on each of the standby nodes:

    systemctl start edb-as-17
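
    To confirm that both standbys are streaming, query pg_stat_replication on the primary. You should see one row per standby with state 'streaming':

    # Run on the primary
    sudo -u enterprisedb /usr/edb/as17/bin/psql -p 5444 -d edb \
        -c "SELECT client_addr, state, sync_state FROM pg_stat_replication"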

Configure SELinux

On all nodes, run the configure-selinux.sh script to configure the SELinux policy for PEM.

/usr/edb/pem/bin/configure-selinux.sh

Register agents and servers on the standbys

On each standby, perform the following steps.

  1. Register the PEM agent. Specify the VIP as the PEM server host and enable alert and notification handling. For example:

    export PEM_SERVER_PASSWORD=password
    /usr/edb/pem/agent/bin/pemworker --register-agent \
                                     --pem-server 172.16.161.245 \
                                     --pem-user pemserver \
                                     --pem-port 5444 \
                                     -o alert_threads=1 \
                                     --enable-snmp true \
                                     --enable-webhook true \
                                     --enable-smtp true \
                                     --max-webhook-retries 3

    See Registering a PEM Agent for more information.

  2. Register the Postgres instance with PEM, replacing 172.16.161.201 in the following command with the address of the standby you are registering. With this command, the PEM web application uses the server's external IP when making a client connection to the database, while the PEM agent uses the loopback interface to connect locally.

    export PEM_SERVER_PASSWORD=password
    /usr/edb/pem/agent/bin/pemworker --register-server \
                                     --pem-user pemserver \
                                     --server-addr 172.16.161.201 \
                                     --server-port 5444 \
                                     --server-database edb \
                                     --server-user pemserver \
                                     --server-service-name edb-as-17 \
                                     --asb-host-name localhost  

    See Registering a Postgres Server for more information.

  3. Execute the following SQL on the pem database as a superuser, providing the correct server address and port for each standby:

     SELECT pem.register_pem_server(server_id) FROM pem.server WHERE server='172.16.161.201' AND port=5444;

     Info

     In older versions of PEM, the PEM server and its local agent had to have ID 1. This is no longer the case from PEM 10.1. Instead, this SQL flags the server and agent as belonging to a PEM deployment, which in turn enables important system jobs, such as purging expired data, when this server is the primary.
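
     For example, using the pemserver superuser created earlier. A sketch; adjust the address and port for each standby, and supply the password you chose when creating the user:

     # Connects through the VIP to the current primary
     export PGPASSWORD=your-password-here
     /usr/edb/as17/bin/psql -h 172.16.161.245 -p 5444 -U pemserver -d pem \
         -c "SELECT pem.register_pem_server(server_id) FROM pem.server WHERE server='172.16.161.201' AND port=5444"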

Set up EFM to manage failover

Perform the following steps to set up EFM:

  1. On the primary, create a database user named efm for EFM to use when connecting to the database servers. Grant the user execute privileges on the WAL-related functions, along with monitoring privileges. As a superuser, run:

    CREATE ROLE efm LOGIN PASSWORD 'password';
    
    -- Give privilege to 'efm' user to connect to a database
    GRANT CONNECT ON DATABASE edb TO efm;
    
    -- Give privilege to 'efm' user to do backup operations
    GRANT EXECUTE ON FUNCTION pg_current_wal_lsn() TO efm;
    GRANT EXECUTE ON FUNCTION pg_last_wal_replay_lsn() TO efm;
    GRANT EXECUTE ON FUNCTION pg_wal_replay_resume() TO efm;
    GRANT EXECUTE ON FUNCTION pg_wal_replay_pause() TO efm;
    GRANT EXECUTE ON FUNCTION pg_reload_conf() TO efm;
    
    -- Grant monitoring privilege to the 'efm' user
    GRANT pg_monitor TO efm;

    This change will be replicated to the standbys.

  2. On all nodes, add entries in pg_hba.conf to allow the efm database user to connect to the database server from any of the nodes:

    hostssl      edb     efm     172.16.161.200/32     scram-sha-256
    hostssl      edb     efm     172.16.161.201/32     scram-sha-256
    hostssl      edb     efm     172.16.161.202/32     scram-sha-256
    hostssl      edb     efm     172.16.161.203/32     scram-sha-256
  3. Reload the configurations on all the database servers.

    SELECT pg_reload_conf();
  4. On all nodes, create an efm.nodes file using the sample file (/etc/edb/efm-5.0/efm.nodes.in), and give read-write access to the EFM OS user:

    cp /etc/edb/efm-5.0/efm.nodes.in /etc/edb/efm-5.0/efm.nodes
    chown efm:efm /etc/edb/efm-5.0/efm.nodes
    chmod 600 /etc/edb/efm-5.0/efm.nodes
  5. On the standby nodes, add the IP address and EFM port of the primary node in the /etc/edb/efm-5.0/efm.nodes file:

    172.16.161.200:7800

  6. On all nodes, create the efm.properties file using the sample file (/etc/edb/efm-5.0/efm.properties.in). Grant read access to all users:

    cp /etc/edb/efm-5.0/efm.properties.in /etc/edb/efm-5.0/efm.properties
    chown efm:efm /etc/edb/efm-5.0/efm.properties
    chmod a+r /etc/edb/efm-5.0/efm.properties
  7. On any node, encrypt the efm database user's password (as defined in step 1 above) using the efm utility. Make a note of the output for the next step.

    export EFMPASS=password
    /usr/edb/efm-5.0/bin/efm encrypt efm --from-env
  8. On all nodes, edit the following parameters in the efm.properties file. Replace <encrypted-password> with the output of the previous step and <ip-addr> with the IP address of each node. (A scripted way to set the per-node bind address follows the listing.)

    Note

    If your hosts are not connected to the internet, replace the value of ping.server with the address of a reliable server accessible on your network that will respond to pings.

    For more detail on EFM configuration, please refer to the EFM documentation.

    db.user=efm
    db.password.encrypted=<encrypted-password>
    db.port=5444
    db.database=edb
    db.service.owner=enterprisedb
    db.service.name=edb-as-17
    db.bin=/usr/edb/as17/bin
    db.data.dir=/var/lib/edb/as17/data
    jdbc.sslmode=require
    user.email=username@example.com
    from.email=node1@efm-pem
    notification.text.prefix=[PEM/EFM] 
    bind.address=<ip-addr>:7800
    is.witness=false
    encrypt.agent.messages=true
    stop.isolated.primary=true
    stop.failed.primary=true
    primary.shutdown.as.failure=false
    ping.server.ip=8.8.8.8
    
    # VIP configuration
    virtual.ip=172.16.161.245
    virtual.ip.interface=ens33
    virtual.ip.prefix=24
    virtual.ip.single=true
    check.vip.before.promotion=true
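
    Because bind.address differs on every node, you may prefer to set it programmatically rather than editing each copy by hand. A sketch, assuming each node's first address reported by hostname -I is the one on the 172.16.161.0/24 network:

    # Run on each node after creating efm.properties
    NODE_IP=$(hostname -I | awk '{print $1}')
    sed -i "s|^bind.address=.*|bind.address=${NODE_IP}:7800|" /etc/edb/efm-5.0/efm.properties
    grep '^bind.address' /etc/edb/efm-5.0/efm.properties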
    
  9. On the witness node, set the value of the is.witness configuration parameter to true:

    is.witness=true
  10. On the primary node, enable and start the EFM service:

    systemctl enable edb-efm-5.0
    systemctl start edb-efm-5.0
  11. On the primary node, allow the standbys to join the cluster:

    /usr/edb/efm-5.0/bin/efm allow-node efm 172.16.161.201
    /usr/edb/efm-5.0/bin/efm allow-node efm 172.16.161.202
    /usr/edb/efm-5.0/bin/efm allow-node efm 172.16.161.203
  12. Enable and start the EFM service on the standby nodes and the EFM witness node:

    systemctl enable edb-efm-5.0
    systemctl start edb-efm-5.0
  13. Check the EFM cluster status by running the following command on any node.

    /usr/edb/efm-5.0/bin/efm cluster-status efm

    The output should look like this:

    Cluster Status: efm
        Agent Type  Address              DB       VIP
        ----------------------------------------------------------------
        Primary     172.16.161.200      UP       172.16.161.245*
        Standby     172.16.161.201      UP       172.16.161.245
        Standby     172.16.161.202      UP       172.16.161.245
        Witness     172.16.161.203      N/A      172.16.161.245
    
    Allowed node host list:
        172.16.161.200 172.16.161.201 172.16.161.202 172.16.161.203
    
    Membership coordinator: 172.16.161.200
    
    Standby priority host list:
        172.16.161.201 172.16.161.202
    
    Promote Status:
    
        DB Type     Address              WAL Received LSN   WAL Replayed LSN   Info
        ---------------------------------------------------------------------------
        Primary     172.16.161.200                         0/F7A3808          
        Standby     172.16.161.201      0/F7A3808          0/F7A3808          
        Standby     172.16.161.202      0/F7A3808          0/F7A3808          
    
        Standby database(s) in sync with primary. It is safe to promote.

This status confirms that EFM is set up successfully and is managing failover for the PEM server.

In case of failover, one of the standbys is promoted to primary, and the PEM agents connect to the new primary node. You can replace the failed primary with a new standby using the procedure above.
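
To exercise the failover path in a test environment, you can perform a controlled switchover and confirm that the VIP moves to the promoted standby. A sketch, not intended for production use:

# Run on any cluster node: promote the highest-priority standby and
# reconfigure the old primary as a standby
/usr/edb/efm-5.0/bin/efm promote efm -switchover

# Verify the new topology and VIP ownership
/usr/edb/efm-5.0/bin/efm cluster-status efm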