When initializing a cluster node by node, the master can move to a slave. #90

alexaht · 2017-04-24T09:43:46Z

Hello,

We are trying to build an HA cluster for Postgres using the PAF script. However, we sometimes see the following behaviour: after a new node is added to the cluster as a slave, that slave is promoted to master once it has been initialized.

Is it possible to avoid such a master move?

Thanks for any advice.

ioguix · 2017-04-24T13:50:28Z

Hello,

You probably want to set a resource-stickiness to keep the master resource from moving around when you add a new node.

In the quick start guide of PAF, we set the default resource-stickiness to 10 and it has proven to be enough during our validations and tests...so far.

see http://dalibo.github.io/PAF/Quick_Start-CentOS-7.html#cluster-resource-creation-and-management

alexaht · 2017-04-24T13:53:47Z

Hello ioguix,

I used this guide when configuring the HA cluster with the PAF script. However, while testing our service, the master migrated to a newly added node a couple of times. In our configuration, resource-stickiness is set to 10.

Could increasing this value interfere with how the pgsqlms script works?

ioguix · 2017-04-24T15:01:01Z

This is strange. How do you add a node? Did you follow the guide on the PAF website as well? Could you provide your setup and the cluster scores before and after you add a node?
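One possible way to capture the scores before and after adding a node is sketched below. It assumes `crm_simulate` (shipped with Pacemaker) is available on the node and run with sufficient privileges; the function names and the `scores-*.txt` file names are hypothetical helpers chosen for illustration, not part of PAF.

```shell
#!/bin/sh
# Sketch only: snapshot cluster scores so two cluster states can be compared.
# Assumption: crm_simulate is on PATH. Function and file names are illustrative.

# snapshot_scores LABEL [DIR]
# Write the scores of the live cluster to DIR/scores-LABEL.txt.
snapshot_scores() {
    label=$1
    dir=${2:-.}
    # -s shows the scores, -L reads the live cluster state
    crm_simulate -sL > "$dir/scores-$label.txt"
}

# compare_scores [DIR]
# Print the lines that differ between the "before" and "after" snapshots.
compare_scores() {
    dir=${1:-.}
    # diff exits with status 1 when the files differ, which is expected here
    diff "$dir/scores-before.txt" "$dir/scores-after.txt" || true
}
```

A typical sequence would be `snapshot_scores before`, then adding the node, then `snapshot_scores after` followed by `compare_scores`.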

alexaht · 2017-04-24T15:29:08Z

First, I initialize a single node with the following parameters:

pcs cluster auth node1 -u hacluster
pcs cluster setup --name VCS-Cluster node1
pcs cluster start node1
pcs -f cluster.xml resource defaults migration-threshold=3
pcs -f cluster.xml resource defaults resource-stickiness=10
pcs -f cluster.xml resource create pgsqld ocf:heartbeat:vcs-pgsqlms \
            bindir=%s pgdata=%s \
            op start timeout=60s \
            op stop timeout=60s \
            op promote timeout=120s \
            op demote timeout=120s \
            op monitor interval=15s timeout=10s role="Master" \
            op monitor interval=16s timeout=10s role="Slave" \
            op notify timeout=120s
pcs -f cluster.xml resource master pgsql-ha pgsqld notify=true
pcs -f cluster.xml resource create pgsql-master-ip ocf:heartbeat:IPaddr2 \
            ip=1.1.1.1 cidr_netmask=24 nic=ens18 op monitor interval=10s
pcs -f cluster.xml constraint colocation add pgsql-master-ip with master pgsql-ha INFINITY
pcs -f cluster.xml constraint order promote pgsql-ha then start pgsql-master-ip symmetrical=false
pcs -f cluster.xml constraint order demote pgsql-ha then stop pgsql-master-ip symmetrical=false
pcs -f cluster.xml resource create apache ocf:heartbeat:apache \
            configfile=/etc/httpd/conf/httpd.conf statusurl="http://localhost/server-status" \
            op monitor interval=1min
pcs -f cluster.xml constraint colocation add apache with master pgsql-ha INFINITY
pcs -f cluster.xml constraint order promote pgsql-ha then start apache symmetrical=false
pcs -f cluster.xml constraint order demote pgsql-ha then stop apache symmetrical=false
pcs -f cluster.xml property set stonith-enabled=false
pcs cluster cib-push cluster.xml

According to the Pacemaker documentation, the number of clone instances (clone-max) defaults to the number of nodes in the cluster.

After adding the second node:
pcs cluster node add node2 --start
we manually set clone-max to the exact number of master and slave nodes, because we may also have an arbiter node without resources, on which all resources are banned:
pcs resource update pgsql-ha clone-max=2

Scores after initiating cluster with one node:
node1 - master-pgsqld 1001
After adding second node:
node1 - master-pgsqld 1001
node2 - master-pgsqld 1000
After adding third node:
node1 - master-pgsqld 1001
node2 - master-pgsqld 1000
node3 - master-pgsqld 990

I have repeated the cluster setup many times, and the master migrated to the new node only a couple of times.
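As a small aid for reading score listings like the one above, the following sketch prints the node with the highest master-pgsqld score. It assumes input lines in exactly the `node - master-pgsqld score` layout used in this comment; the function name `top_master_score` is hypothetical. The score is only one input to Pacemaker's placement decision, alongside stickiness and constraints, so this is a quick sanity check rather than a prediction of where the master will run.

```shell
#!/bin/sh
# Sketch only: find the node with the highest master-pgsqld score.
# Assumption: stdin holds lines such as "node1 - master-pgsqld 1001".

top_master_score() {
    awk '
        # Field 1 is the node name, field 3 the attribute, field 4 the score.
        $3 == "master-pgsqld" {
            if (best == "" || $4 + 0 > max) {
                max = $4 + 0
                best = $1
            }
        }
        # Print nothing when no matching line was seen.
        END { if (best != "") print best, max }
    '
}
```

For the three-node listing above, `printf '%s\n' 'node1 - master-pgsqld 1001' 'node2 - master-pgsqld 1000' 'node3 - master-pgsqld 990' | top_master_score` prints `node1 1001`.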

alexaht · 2017-04-26T11:02:45Z

Update: any advice?

ioguix · 2017-05-02T16:37:56Z

Hey,

Sorry for the delay, I'm quite busy this week... I might not be able to answer for 1 or 2 weeks, especially if I need to build the cluster myself and poke around :/

In the meantime, please provide as much information as you can: log files, timestamps to look at, scenario + commands, etc.
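One way to gather logs for a given time window is Pacemaker's `crm_report` tool, sketched below. The wrapper name `collect_cluster_report` and the default destination `paf-issue-report` are hypothetical, and the sketch assumes `crm_report` is installed and can reach the other nodes.

```shell
#!/bin/sh
# Sketch only: collect a cluster report covering the time of the incident.
# Assumption: crm_report (shipped with Pacemaker) is on PATH.

# collect_cluster_report FROM TO [DEST]
# FROM and TO are timestamps such as "2017-04-24 09:00:00".
collect_cluster_report() {
    from=$1
    to=$2
    dest=${3:-paf-issue-report}
    # -f is the start of the window, -t the end; DEST names the report
    crm_report -f "$from" -t "$to" "$dest"
}
```

Calling `collect_cluster_report "2017-04-24 09:00:00" "2017-04-24 16:00:00"` right after a reproduction keeps the relevant logs from being rotated away.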

alexaht · 2017-05-03T15:22:24Z

Hey Jehan,

Unfortunately, the logs for the period when the master migrated to a newly added node are gone. If I reproduce this behaviour, I'll dump the logs and post an update in a reply.

Many thanks for your awesome work.
