CKAN High Availability
This document describes a basic way to set up a CKAN high availability cluster. It is written primarily for CKAN 2.0 running on Ubuntu 12.04, but similar steps should work with all recent versions of CKAN. We first describe how the frontend components can be duplicated to provide redundancy, followed by suggestions for possible configurations of the main backend dependencies for CKAN 2.0: PostgreSQL and Solr.
Redundancy on the frontend can be provided by having multiple web servers, each with their own copy of the CKAN code. As CKAN is a WSGI app, any suitable web server setup can be used, but we generally recommend Apache with mod_wsgi.
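As a rough sketch, each web server's Apache virtual host might look something like the following. The paths, port and process names follow the standard CKAN source-install layout and are only illustrative (the port also needs a matching Listen directive in ports.conf):
<VirtualHost 0.0.0.0:8080>
    # serve CKAN via mod_wsgi; apache.wsgi is the usual CKAN WSGI script location
    WSGIScriptAlias / /etc/ckan/default/apache.wsgi
    # pass authorization headers through (needed for the CKAN API)
    WSGIPassAuthorization On
    WSGIDaemonProcess ckan_default display-name=ckan_default processes=2 threads=15
    WSGIProcessGroup ckan_default
    ErrorLog /var/log/apache2/ckan_default.error.log
    CustomLog /var/log/apache2/ckan_default.custom.log combined
</VirtualHost>
The port used here (8080) is then the <server port> that the nginx load balancer below forwards requests to.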
A load balancer is then placed in front of the web servers (we recommend nginx). To configure nginx to load-balance the servers, create a file in nginx's sites-available directory containing:
upstream backend {
ip_hash; # send requests from given IP to same server each time
server <server 1 IP>:<server 1 port> max_fails=3 fail_timeout=10s;
server <server 2 IP>:<server 2 port> max_fails=3 fail_timeout=10s;
}
server {
location / {
proxy_pass http://backend;
}
}
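Assuming the file was saved as /etc/nginx/sites-available/ckan_cluster (the name is arbitrary), it then needs to be symlinked into sites-enabled and nginx reloaded, for example:
sudo ln -s /etc/nginx/sites-available/ckan_cluster /etc/nginx/sites-enabled/ckan_cluster
sudo nginx -t && sudo service nginx reload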
Notes:
- Each instance must have the same settings for beaker.session.key and beaker.session.secret in the CKAN config file (see the sketch after these notes).
- If OpenID is used, something like Memcache will additionally be required in order to share session information between servers.
- Without Memcache it is also possible that some flash messages will not be displayed (as they are currently stored to disk), but the use of ip_hash should minimise this.
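For example, every instance's CKAN config file might contain something like the following; the values shown are placeholders, what matters is that they are identical on all of the web servers:
beaker.session.key = ckan
beaker.session.secret = <the same randomly generated string on every server>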
There are various ways that a PostgreSQL cluster can be configured; see the PostgreSQL wiki [1] for a brief overview. Here we are going to describe how to set up two PostgreSQL servers so that one acts as a master and the other is available as a warm standby [2] machine. Also see these documents [3] [4] for reference.
The basic idea is that the CKAN instance(s) will all use the same master server. If a failure is detected, the standby server will be brought online and the instance(s) updated to use it as the new master. This process is not automatic; if automatic failover is required then a more complex setup (possibly using a tool like Slony [5]) will be required. A rough sketch of the manual failover is given at the end of this section. The steps to set up the warm standby configuration are as follows:
On the master server:
- As the postgres user, create a new ssh key (with no passphrase). This will be used to connect to the standby server.
sudo -u postgres mkdir /var/lib/postgresql/.ssh
sudo -u postgres chmod 700 /var/lib/postgresql/.ssh
sudo -u postgres ssh-keygen -t rsa -b 2048 -f /var/lib/postgresql/.ssh/rsync-key
On the standby server:
- Copy the newly created public key to the authorized_keys file of the postgres user on the standby server. In Ubuntu 12.04 this defaults to /var/lib/postgresql/.ssh/authorized_keys.
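For example, one way to do this (run on the standby server; the echoed value is a placeholder for the actual contents of rsync-key.pub on the master):
sudo -u postgres mkdir -p /var/lib/postgresql/.ssh
sudo -u postgres chmod 700 /var/lib/postgresql/.ssh
echo '<contents of /var/lib/postgresql/.ssh/rsync-key.pub from the master>' | sudo -u postgres tee -a /var/lib/postgresql/.ssh/authorized_keys > /dev/null
sudo -u postgres chmod 600 /var/lib/postgresql/.ssh/authorized_keys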
On the master server:
- Verify that you can connect to the standby server as the postgres user.
sudo -u postgres ssh -i /var/lib/postgresql/.ssh/rsync-key postgres@<standby server IP>
- Edit the file /etc/postgresql/9.1/main/postgresql.conf:
  - On line 153 set wal_level = archive.
  - On line 181 set archive_mode = on.
  - On line 183 set archive_command = 'rsync -avz -e "ssh -i /var/lib/postgresql/.ssh/rsync-key" %p postgres@<standby server IP>:/var/lib/postgresql/9.1/archive/%f' (where /var/lib/postgresql/9.1/archive is where the WAL files will be stored on the standby server).
- Restart postgres.
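On Ubuntu 12.04 this can be done with the init script, for example:
sudo service postgresql restart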
On the standby server:
- Stop PostgreSQL if it is currently running.
- Move the existing data directory out of the way:
sudo mv /var/lib/postgresql/9.1/main /var/lib/postgresql/9.1/main.backup
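The archive directory that the master's archive_command rsyncs WAL files into also needs to exist on the standby and be writable by the postgres user, if it does not already:
sudo -u postgres mkdir -p /var/lib/postgresql/9.1/archive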
On the master server:
- Save a backup of the current database to the standby server.
sudo -u postgres psql -c "select pg_start_backup('ckan-initial-backup', true);"
sudo -u postgres rsync -avz -e "ssh -i /var/lib/postgresql/.ssh/rsync-key" --exclude 'pg_log/*' --exclude 'pg_xlog/*' --exclude postmaster.pid /var/lib/postgresql/9.1/main/ postgres@<standby server IP>:/var/lib/postgresql/9.1/main
sudo -u postgres psql -c "select pg_stop_backup();"
On the standby server:
- Install the postgresql-contrib package to get the pg_standby program:
sudo apt-get install postgresql-contrib-9.1
- Create a file in the postgres data directory called recovery.conf, containing:
restore_command = '/usr/lib/postgresql/9.1/bin/pg_standby -t /var/lib/postgresql/9.1/recovery.trigger /var/lib/postgresql/9.1/archive/ %f %p %r'
/var/lib/postgresql/9.1/recovery.trigger is the path to the trigger file; creating this file will cause the standby server to come online.
- Start postgres.
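For example, a sketch of these last two steps, assuming the default 9.1 data directory used above (any method of creating the file as the postgres user will do):
sudo -u postgres tee /var/lib/postgresql/9.1/main/recovery.conf > /dev/null <<'EOF'
restore_command = '/usr/lib/postgresql/9.1/bin/pg_standby -t /var/lib/postgresql/9.1/recovery.trigger /var/lib/postgresql/9.1/archive/ %f %p %r'
EOF
sudo service postgresql start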
On the master server:
- Add some data (enough to create several WAL files).
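If there is no real data to load yet, one way to generate some WAL activity is a throwaway table (the table name is just an example); pg_switch_xlog() then forces the current WAL segment to be completed and archived:
sudo -u postgres psql -c "create table ha_smoke_test as select generate_series(1, 500000) as id;"
sudo -u postgres psql -c "select pg_switch_xlog();"
sudo -u postgres psql -c "drop table ha_smoke_test;"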
On the standby server:
- Verify that the WAL files are being stored in the /var/lib/postgresql/9.1/archive directory (or equivalent).
- Read the postgres log file to verify that the WAL files are being read.
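When the master actually fails, the manual failover described above amounts to something like the following sketch (the CKAN config path is illustrative):
# On the standby server: create the trigger file so that pg_standby stops waiting,
# recovery finishes and the server starts accepting connections.
sudo -u postgres touch /var/lib/postgresql/9.1/recovery.trigger
sudo tail /var/log/postgresql/postgresql-9.1-main.log
# On each web server: point sqlalchemy.url in the CKAN config file
# (e.g. /etc/ckan/default/production.ini) at the promoted standby, then restart Apache.
sudo service apache2 restart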
Solr replication is described on the Solr wiki [6]. The steps necessary to set up replication using a single-core master server and a single slave server are as follows:
On the master server:
- Edit /etc/solr/conf/solrconfig.xml, adding a new replication request handler at around line 505:
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
    <str name="replicateAfter">startup</str>
  </lst>
</requestHandler>
- Restart Jetty.
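Assuming Solr is running under the Jetty init script, as in a standard CKAN install, the restart and a quick check of the replication handler might look like:
sudo service jetty restart
# the details command is part of Solr's ReplicationHandler and reports the master's index version
curl 'http://<master server IP>:<master server port>/solr/replication?command=details'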
On the slave server:
- Edit /etc/solr/conf/solrconfig.xml, adding a new replication request handler at around line 505:
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <!-- fully qualified url for the replication handler of the master -->
    <str name="masterUrl">http://<master server IP>:<master server port>/solr/replication</str>
    <!-- interval at which the slave should poll the master; format is HH:mm:ss -->
    <str name="pollInterval">00:00:20</str>
  </lst>
</requestHandler>
- Restart Jetty.
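The same check can be run against the slave once Jetty has been restarted; its report includes the master being polled and the current replication status. A replication cycle can also be triggered by hand rather than waiting for pollInterval:
curl 'http://<slave server IP>:<slave server port>/solr/replication?command=details'
curl 'http://<slave server IP>:<slave server port>/solr/replication?command=fetchindex'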