Wednesday, March 20, 2013

How to install PGPool II on PostgreSQL Servers in master-slave architecture + PGPoolAdmin web managment


General Information

PGPool can run on same server along with PostgreSQL DB or on stand alone server(recommended). In this article we will install PGPool on stand alone server, but the only difference is connection ports on PGPool and PostgreSQL.
We will install PGPool II 3.1 on PostgreSQL 9.1.


Basic architecture:
┏───────────────┓
│                │
│    pgpool-1    │
│  pgpool server │
│                │
┗───────────────┛
//             \\
//               \\
//                 \\
\\//                \\//
\/                  \/
┏───────────────┓           ┏───────────────┓
│                │           │                │
│    pgsql-1     │ streaming │     pgsql-2    │
│  pgsql master  │══════════>│  pgsql slave   │
│     server     │replication│     server     │
│                │           │                │
┗───────────────┛           ┗───────────────┛


Fail cases:

Slave fails

In case slave server will fail PGPool will start failover.sh script and will mark server as Down (state 3). It'll reconnect all open connections to this server to the master server, at this case users that have been on this connection will be disconnection.
When the server will be fixed and started up you will need manually start recovery process. To do this the PostgreSQL need to be turned off on the server. Recovery process will take backup from the master and restore it on slave. When it will finish restore process PGPool will connect it back. Streaming replication process will restore all the data inserted since the backup. No data lost.

Master fails

In case master server will fail PGPool will start failover.sh script, that will notice that the failed server is master and set trigger to slave server. When slave server notice the trigger it will promote itself to master. All the connections to master will be reconnected to slave and user will be disconnected. All the data inserted to master and not yet replicated to slave will be lost. When the server will be fixed and started you will need manually start recovery process. To do this PostgreSQL need to be turned off on the server. Recovery process will take backup from the master and restore it on slave. When it will finish restore process PGPool will connect it back. Streaming replication process will restore all the data inserted since the backup.

PGPool fails

PGPool is SPOF and if it fails there will be no connection to client. To fix this problem you need to create PGPool HA cluster. This can be done with cluster software like Pacemaker or Linux Heartbeat.


PGPool II setup

PGPool can be installed from source of from OS packages, I recommended installing from source, as we will need some files from the tar package. Source files can be downloaded from PGPool official site: http://pgpool.net/mediawiki/index.php/Downloads#pgpool-II
Before compiling PGPool you might need to install missing packages:

  • For Debian/Ubuntu: libpq-dev
  • For RH/CentOS: postgresql-libs
Download needed version of PGPool, open and install:
$ tar xfz /some/where/pgpool-II-3.1.1.tar.gz
$ cd pgpool-II-3.1.1
$ ./configure
$ make
$ make install



You can configure PGPool install path using –prefix flag, in this article we will use default configurations, which will install PGPool configurations to /usr/local/share/etc
If you installing from OS software manager you can't configure the path, but you won't need installing missing packages. Default path will for configuration will be /etc/pgpool2/
After installing PGPool we need to do some basic configuration, here's the sample pgpool.conf file with basic configuration needed to run PGPool in master-slave with streaming replication:
link for configuration


Make sure you changing this configuration to ones you need:


  • PGPool client connections port
port = 5432

  • PostgreSQL Master server
backend_hostname0 = 'pgpool-1'

  • PostgreSQL connection port
backend_port0 = 5432

  • Load balancing weight
backend_weight0 = 0

  • PostgreSQL data path(on PostgreSQL server)
backend_data_directory0 = '/var/lib/postgresql/9.1/main'

  • PostgreSQL username and password (super user)
sr_check_user = 'postgres'
sr_check_password = ''

  • Health check for PostgreSQL nodes period in seconds
health_check_period = 10

  • PostgreSQL username and password (super user)
health_check_user = 'postgres'
health_check_password = ''

  • User to run recovery script on PostgreSQL server
recovery_user = 'postgres'
recovery_password = ''

Next we will setup user for PGPool Administration tool called PCP(later can be used in PGPoolAdmin). We need to encrypt user password to md5:
$ pg_md5 password


Edit pcp.conf and insert the new password along with username to the end of the file:
username:5f4dcc3b5aa765d61d8327deb882cf99


Next we need to add script for failover, that already configured in pgpool.conf. This script will check if the PostgreSQL server that failed is master or slave. In case of master fail it will put trigger on the slave and it will promote itself to master. Put this script at same location as all configuration files /usr/local/share/etc and give it execute permissions:
link to failover.sh
Create necessary directories
$ mkdir /var/run/pgpool
$ mkdir /var/log/pgpool

Set permissions to this directories for user that will run PGPool.
Now PGPool is fully configured and can be started. To start it just run PGPool command. By default it will run at same shell as your user and will print the log to STDOUT. Here is sample of command how to start PGPool as deamon and save the log to file:
$ pgpool -n -d > /tmp/pgpool.log 2>&1 &

You may not start the PGPool now if you want to install PGPoolAdmin.


PGPoolAdmin setup(optional)

PGPoolAdmin allows to manage PGPool: start/stop PGPool, edit configurations, add/remove/ recovery/promote PostgreSQL nodes.
To install PGPoolAdmin you will need to install first Apache service to the server(I hope you know how to do it).
Open the tar file and put it content into Apache data directory:
$ tar xfz /some/where/pgpoolAdmin-3.1.1.tar.gz
$ mv /some/where/pgpoolAdmin-3.1.1/pgpooladmin /var/www/html/

Now we need to set permissions to apache user to edit configuration files
$ chown apache /var/run/pgpool
$ chown apache /var/log/pgpool
$ chown apache /share/local/etc/pgpool.conf
$ chown apache /share/local/etc/pool_hba.conf
$ chown -R apache /var/www/html/pgpooladmin

PGPool recovery script must login to PostgreSQL server and create trigger file, to do it must login to server via ssh from user apache. To increase security you might want to set firewall to allow to connection only from PGPool server.
PGPoolAdmin is ready to installed and used. Open in you browser http://serverIP/pgpooladmin/install/index.php follow the steps. At the end PGPoolAdmin will start PGPool on the server. To login use PCP user and password configured before.


PostgreSQL setup

At this point we have PGPool running and accepting connections, but it might not work properly as PostgreSQL might not accept those connections. At this step we will re-configure your PostgreSQL server. This step must be done to all new node you want to add to PGPool, before running recovery process.
Edit postgresql.conf configuration file with following variables:
listen_address = '*'
hot_standby = on
wal_level = hot_standby
max_wal_senders = 1

Edit pg_hba.conf configuration file:
host all all 0.0.0.0/0 trust
host replication postgres 0.0.0.0/0 trust

This mean that the server will accept connections from all IP's without authentication for any user. It's possible to put your subnet mask or IP of PGPool server. Note, md5 authentication is not working with PGPool master-slave configuration.
Second allows to do streaming replication between any server without authentication for user postgres. This can be changed for needed subnet or IP of second server and if you want to use other user for replication(User must have REPLICATION or SUPERUSER permission on the second server).
Now we need to setup some PGPool recovery functions. If you have installed PGPool from source you have them inside tar file. If you installed it from software manager download the source file and follow this step.
Inside the source file you have sql directory. You need to compile it each one of them and run the compiled SQL file on postgres DB and template1. If you installed PostgreSQL for software manager you will need to install additional package:

  • For Debian/Ubuntu: postgresql-server-dev
  • For RH/CentOS: postgresql-devel
Run in each directory:
$ make install
$ psql -f pgpool-*****.sql postgres
$ pgsql -f pgpool-*****.sql template1

PGPool recovery process is using 2 scripts, one on master and second on slave. First script called basebackup.sh, it's executed from the master and creates backup, then it sends it to the slave. This script need to be in PostgreSQL data directory(default /var/lib/postgresql/9.1/main) with execute permissions Next user that will run recovery process on the nodes need to be able to the login without password to the each other(again, I hope you can do it by yourself).

Second script allows PGPool to start PostgreSQL service. This script uses pg_ctl command, check that the path is correct. If you don't have pg_ctl change the path to /etc/init.d/postgresql and remove from the command “-w -D $DESTDIR”.

Create trigger directory and give postgres permissions on it
$ mkdir -p /var/log/pgpool/trigger
$ chown postgres /var/log/pgpool/trigger

Finishing the setup

Now you should have running PGPool and configured PostgreSQL master server. Last step is to configure the slave and test everything. Do all the configuration on the slave same as on the master and at the end stop the PostgreSQL service. If you installed PGPoolAdmin login into the interface, go to “Edit pgpool.conf” and then to “Backends” and press “Add”. Fill all the info for new backend and change “backend_weight” on both backends to 1. Save the configuration and reload PGPool. When PGPool will discover new node press “Recovery” near it. After recovery


Have questions? Just contact us right away and we will be happy to assist


Powered by 123ContactForm | Report abuse

2 comments:

saritha N said...

Hi,I am new to this postgresql and pgpool.I have installed Binary package: postgresql-pgpool-II-3.2.0-1.el6.x86_64.rpm in my system .I am not finding any sql directory to install recovery process for pgpool-II.Please help me in resolving this.

Minh-Danh Ngo said...

It looks like the links to the sample config files are missing...