draft setup xcat MN HA using shared data steps #6

Open
bybai opened this issue Sep 28, 2017 · 1 comment

bybai commented Sep 28, 2017

This issue records the command list for setting up xCAT management node (MN) HA using shared data.

@bybai bybai self-assigned this Sep 28, 2017

bybai commented Sep 28, 2017

This is my own command list; it is not official documentation yet. The official doc will be written from this content in the next plan.

The example below uses NFS-based shared data.

nfs server: c910f05c01bc06 10.5.106.1
primary mn: bybc0609 10.5.106.9
secondary mn: bybc0605 10.5.106.5
test node (cn or sn): bybc0607 10.5.106.7
virtual ip address: 10.5.106.100
virtual hostname: byrhmn

configure primary and secondary xcat mn

on nfs server, export /HA:

[root@c910f05c01bc06 /]# cat /etc/exports|grep HA
/HA *(rw,no_root_squash,sync,no_subtree_check)
[root@c910f05c01bc06 /]# exportfs -a
[root@c910f05c01bc06 /]# service nfs restart
Redirecting to /bin/systemctl restart nfs.service
[root@c910f05c01bc06 /]# showmount -e
Export list for c910f05c01bc06:
/HA *
[root@c910f05c01bc06 /]# mkdir -p /HA/etc/xcat
[root@c910f05c01bc06 /]# mkdir -p /HA/root/.xcat
[root@c910f05c01bc06 /]# mkdir -p /HA/install
[root@c910f05c01bc06 /]# mkdir -p /HA/var/lib/pgsql
[root@c910f05c01bc06 /]# mkdir -p /HA/tftpboot

on primary mn:

1, configure shared data:
[root@bybc0609 ~]# mkdir /etc/xcat
[root@bybc0609 ~]# mkdir /install/
[root@bybc0609 ~]# mkdir ~/.xcat
[root@bybc0609 ~]# mkdir /var/lib/pgsql
[root@bybc0609 ~]# mount -o rw c910f05c01bc06:/HA/etc/xcat /etc/xcat
[root@bybc0609 ~]# mount -o rw c910f05c01bc06:/HA/root/.xcat ~/.xcat
[root@bybc0609 ~]# mount -o rw c910f05c01bc06:/HA/install /install
[root@bybc0609 ~]# mount -o rw c910f05c01bc06:/HA/var/lib/pgsql /var/lib/pgsql
[root@bybc0609 data]# mount -o rw 10.5.106.1:/HA/tftpboot /tftpboot

[root@bybc0609 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/system-root 18G 981M 17G 6% /
devtmpfs 885M 0 885M 0% /dev
tmpfs 896M 0 896M 0% /dev/shm
tmpfs 896M 8.5M 888M 1% /run
tmpfs 896M 0 896M 0% /sys/fs/cgroup
/dev/sda1 509M 137M 373M 27% /boot
tmpfs 180M 0 180M 0% /run/user/0
c910f05c01bc06:/HA/etc/xcat 256G 233G 24G 91% /etc/xcat
c910f05c01bc06:/HA/root/.xcat 256G 233G 24G 91% /root/.xcat
c910f05c01bc06:/HA/install 256G 233G 24G 91% /install
c910f05c01bc06:/HA/var/lib/pgsql 256G 233G 24G 91% /var/lib/pgsql
c910f05c01bc06:/HA/tftpboot 256G 238G 18G 91% /tftpboot
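
The five mounts above follow one pattern: each local path maps to the same path under /HA on the NFS server. A minimal sketch of that pattern as a reusable function follows. The names `NFS_SERVER`, `HA_ROOT`, `DRY_RUN`, `SHARED_DIRS` and `mount_shared` are my own for illustration, and the default only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: print (or run) the mount commands for the shared directories.
# NFS_SERVER and HA_ROOT mirror the values used in this issue; the variable
# and function names are illustrative, not part of xCAT.
NFS_SERVER=${NFS_SERVER:-c910f05c01bc06}
HA_ROOT=${HA_ROOT:-/HA}
DRY_RUN=${DRY_RUN:-1}    # 1 = only print the commands (assumed default)

# Local mount points; each maps to $HA_ROOT<path> on the NFS server.
SHARED_DIRS="/etc/xcat /root/.xcat /install /var/lib/pgsql /tftpboot"

mount_shared() {
    for dir in $SHARED_DIRS; do
        cmd="mount -o rw $NFS_SERVER:$HA_ROOT$dir $dir"
        if [ "$DRY_RUN" = 1 ]; then
            echo "$cmd"
        else
            # Create the mount point if needed, then mount; stop on failure.
            mkdir -p "$dir" || return 1
            $cmd || return 1
        fi
    done
}

mount_shared
```

Run it as root with `DRY_RUN=0` to perform the mounts for real.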

2, install xCAT following the xCAT doc, then add both management nodes and the virtual hostname to the policy table:
[root@bybc0609 x86_64]# tabedit policy
"1.2","bybc0609.cluster.com",,,,,,"trusted",,
"1.3","bybc0605.cluster.com",,,,,,"trusted",,
"1.4","byrhmn.cluster.com",,,,,,"trusted",,

3, switch database to postgresql
[root@bybc0609 x86_64]# chdef -t site databaseloc=/var/lib/pgsql
[root@bybc0609 x86_64]# yum -y install postgresql*
[root@bybc0609 x86_64]# yum -y install perl-DBD-Pg.x86_64
[root@bybc0609 x86_64]# pgsqlsetup -i -V

4, set up the virtual ip address on the primary mn, configure it in /etc/hosts and /etc/resolv.conf, and update /etc/nsswitch.conf
[root@bybc0609 ~]# ifconfig eth0:0 10.5.106.100 netmask 255.0.0.0
[root@bybc0609 ~]# ip address show eth0 |grep inet
inet 10.5.106.9/8 brd 10.255.255.255 scope global dynamic eth0
inet 10.5.106.100/8 brd 10.255.255.255 scope global secondary eth0:0
[root@bybc0609 ~]# cat /etc/resolv.conf
search cluster.com.
nameserver 10.5.106.100
[root@bybc0609 ~]# cat /etc/hosts|grep 09
10.5.106.100 bybc0609 bybc0609.cluster.com
[root@bybc0609 ~]# grep hosts /etc/nsswitch.conf
hosts: files dns myhostname
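
On systems where ifconfig is not installed, the same alias can be managed with iproute2. This is a sketch under that assumption, not what was run above. The prefix length 8 corresponds to netmask 255.0.0.0, and the `run`, `vip_up` and `vip_down` helpers are hypothetical names.

```shell
#!/bin/sh
# Sketch: add or remove the virtual IP with iproute2 instead of ifconfig.
VIP=${VIP:-10.5.106.100}
PREFIX=${PREFIX:-8}        # same as netmask 255.0.0.0 in the ifconfig call
NIC=${NIC:-eth0}
DRY_RUN=${DRY_RUN:-1}      # 1 = only print the commands (assumed default)

# Print the command in dry-run mode, otherwise execute it.
run() { if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi; }

# Add the VIP as a labelled secondary address (shows up as eth0:0).
vip_up()   { run ip addr add "$VIP/$PREFIX" dev "$NIC" label "$NIC:0"; }

# Remove the VIP again, for example before failing over.
vip_down() { run ip addr del "$VIP/$PREFIX" dev "$NIC"; }

vip_up
vip_down
```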

5, make xcat use the virtual ip: change the site table attributes master, nameservers, tftpserver etc. to the virtual ip, then check the result with the following command:
lsdef -t site -l

6, configure xcatdb to use postgresql.
add the virtual ip to the postgresql configuration files pg_hba.conf and postgresql.conf, then restart the postgresql and xcatd services (step 7).
[root@bybc0609 data]# cat /var/lib/pgsql/data/pg_hba.conf | grep host
host all all 10.5.106.9/32 md5
host all all 10.5.106.100/32 md5
host all all 10.5.106.5/32 md5
host all all 10.5.106.7/32 md5
host all all 127.0.0.1/32 trust
host all all ::1/128 trust
[root@bybc0609 data]# cat /var/lib/pgsql/data/postgresql.conf |grep listen
listen_addresses = 'localhost,10.5.106.9,10.5.106.100,10.5.106.5'
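
The entries above are regular enough to generate from a list of addresses, which helps keep pg_hba.conf and postgresql.conf consistent when a node is added. A minimal sketch follows; the function names `pg_hba_lines` and `listen_addresses_line` are made up for illustration, and the output still has to be merged into the real files by hand.

```shell
#!/bin/sh
# Sketch: print pg_hba.conf "host" lines for a list of client IPs.
pg_hba_lines() {
    for ip in "$@"; do
        printf 'host all all %s/32 md5\n' "$ip"
    done
}

# Sketch: print the listen_addresses setting for a list of local IPs.
listen_addresses_line() {
    printf "listen_addresses = 'localhost"
    for ip in "$@"; do
        printf ',%s' "$ip"
    done
    printf "'\n"
}

# Primary MN, virtual IP, standby MN, and the test node from this issue.
pg_hba_lines 10.5.106.9 10.5.106.100 10.5.106.5 10.5.106.7
# Addresses the database should listen on (no test node here).
listen_addresses_line 10.5.106.9 10.5.106.100 10.5.106.5
```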

7, set the virtual hostname, then restart the db and xcatd services
[root@bybc0609 ~]# hostname byrhmn
[root@bybc0609 data]# service postgresql restart
[root@bybc0609 data]# service xcatd restart

8, check the db for entries that still reference the primary mn address, for example node definitions, and replace every 10.5.106.9 with 10.5.106.100.
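
One way to find the remaining references is to dump the tables to CSV files and search them. The sketch below assumes the tables were first dumped with `dumpxCATdb -p /tmp/xcatdump` (the path is only an example). `find_ip_refs` is a hypothetical helper, not an xCAT command.

```shell
#!/bin/sh
# Sketch: list dumped xCAT table files that still mention the old IP.
# Assumes the tables were dumped as CSV files first, e.g. with
#   dumpxCATdb -p /tmp/xcatdump
find_ip_refs() {
    dump_dir=$1
    ip=$2
    # Escape the dots so the IP is matched literally, and reject a digit on
    # either side so 10.5.106.9 does not match 10.5.106.90.
    pattern=$(printf '%s' "$ip" | sed 's/\./\\./g')
    grep -lE "(^|[^0-9])$pattern([^0-9]|\$)" "$dump_dir"/*.csv
}

# Example call (commented out because it needs an existing dump):
# find_ip_refs /tmp/xcatdump 10.5.106.9
```

Each file name printed corresponds to a table to review with tabedit or chdef.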

9, stop the xcatd and db services and remove the virtual ip, so that the standby MN can be set up
[root@bybc0609 ~]# service xcatd stop
Stopping xcatd (via systemctl): [ OK ]
[root@bybc0609 ~]# service postgresql stop
Redirecting to /bin/systemctl stop postgresql.service
[root@bybc0609 ~]# ifconfig eth0:0 0.0.0.0 0.0.0.0

on standby mn:

1, install xcat using ip 10.5.106.5
2, add the virtual ip 10.5.106.100 and configure it in /etc/hosts, /etc/resolv.conf and /etc/nsswitch.conf
[root@bybc0605 x86_64]# ifconfig eth0:0 10.5.106.100 netmask 255.0.0.0
[root@bybc0605 x86_64]# grep 10.5.106.100 /etc/resolv.conf
nameserver 10.5.106.100
[root@bybc0605 x86_64]# grep 10.5.106.100 /etc/hosts
10.5.106.100 bybc0605 bybc0605.cluster.com
[root@bybc0605 x86_64]# grep hosts /etc/nsswitch.conf
hosts: files dns myhostname

3, change xcat to use the virtual ip 10.5.106.100, check the site table attributes, then restart xcatd
[root@bybc0605 x86_64]# lsdef -t site -l
[root@bybc0605 x86_64]# service xcatd restart

4, Set up ssh authentication between the primary management node and the standby management node. It should be set up as “passwordless ssh authentication” and it should work in both directions. The summary of this procedure is:

  1. cat the keys from ~/.ssh/id_rsa.pub on the primary management node and add them to ~/.ssh/authorized_keys on the standby management node. Remove the standby management node entry from ~/.ssh/known_hosts on the primary management node before issuing ssh to the standby management node.
  2. cat the keys from ~/.ssh/id_rsa.pub on the standby management node and add them to ~/.ssh/authorized_keys on the primary management node. Remove the primary management node entry from ~/.ssh/known_hosts on the standby management node before issuing ssh to the primary management node.
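
The "add them to authorized_keys" part of this procedure is easy to repeat by accident and end up with duplicate entries. A minimal sketch of an idempotent append is shown here; `add_authorized_key` is a hypothetical helper, and it works on local files only, so the public key has to be copied to the other node first.

```shell
#!/bin/sh
# Sketch: append a public key to an authorized_keys file only if it is not
# already present, so the step can be repeated safely.
add_authorized_key() {
    pubkey_file=$1
    auth_file=$2
    key=$(cat "$pubkey_file") || return 1
    mkdir -p "$(dirname "$auth_file")" || return 1
    # sshd ignores authorized_keys files with loose permissions.
    touch "$auth_file" && chmod 600 "$auth_file" || return 1
    # -x: whole line, -F: fixed string, so the key is compared literally.
    grep -qxF -- "$key" "$auth_file" || printf '%s\n' "$key" >> "$auth_file"
}

# Example call on the standby MN, after copying the primary's key to /tmp
# (commented out because the file name is only an example):
# add_authorized_key /tmp/bybc0609_id_rsa.pub "$HOME/.ssh/authorized_keys"
```

The stale known_hosts entry can be removed with `ssh-keygen -R bybc0605` on the primary (and `ssh-keygen -R bybc0609` on the standby). Where it is available, `ssh-copy-id` performs the copy and append in one step.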

5, Make sure the time on the primary management node and standby management node is synchronized.
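
A quick way to check this once passwordless ssh works is to compare the epoch time on both nodes. The sketch below only does the arithmetic; `clock_skew` is a made-up helper name, and a real setup would normally rely on NTP or chrony rather than a manual check.

```shell
#!/bin/sh
# Sketch: report the absolute difference, in seconds, between two epoch
# timestamps.
clock_skew() {
    diff=$(( $1 - $2 ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( 0 - diff ))
    fi
    echo "$diff"
}

# Example call from the primary MN (commented out because it needs ssh):
# clock_skew "$(date +%s)" "$(ssh bybc0605 date +%s)"
```

The ssh round trip adds some delay, so a result of a second or two does not by itself indicate a problem.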

6, install postgresql:
[root@bybc0605 x86_64]# chdef -t site databaseloc=/var/lib/pgsql
[root@bybc0605 x86_64]# yum -y install postgresql*
[root@bybc0605 x86_64]# yum -y install perl-DBD-Pg.x86_64

7, stop xcatd and remove the virtual ip
[root@bybc0605 .ssh]# service xcatd stop
[root@bybc0605 .ssh]# ifconfig eth0:0 0.0.0.0 0.0.0.0

8, go back to the primary mn, bring the virtual ip back up, start postgresql and xcatd, and keep using it as the primary xcat MN

rsync ssh keys and /etc/hosts file
Add the following to the crontab on the current primary MN:
0 1 * * * /usr/bin/rsync -Lprgotz $HOME/.ssh/id* bybc0605:$HOME/.ssh/
0 2 * * * /usr/bin/rsync -Lprogtz /etc/hosts bybc0605:/etc/

Failover

On the current primary management node:

1, Stop the xCAT daemon, dhcpd and postgresql
[root@bybc0609 ~]# service xcatd stop
Stopping xcatd (via systemctl): [ OK ]
[root@bybc0609 ~]# service dhcpd stop
[root@bybc0609 ~]# service postgresql stop
Redirecting to /bin/systemctl stop postgresql.service

2, unexport the xCAT NFS directories
[root@bybc0609 ~]# exportfs -ua

3, Unmount shared data
[root@bybc0609 ~]# umount /etc/xcat
[root@bybc0609 ~]# umount /install
[root@bybc0609 ~]# umount ~/.xcat
[root@bybc0609 ~]# umount /tftpboot
[root@bybc0609 ~]# umount /var/lib/pgsql

4, unconfigure virtual ip
[root@bybc0609 ~]# ifconfig eth0:0 0.0.0.0 0.0.0.0
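
The four steps above depend on their order: services stop first, then the storage goes away, then the virtual ip. As a sketch only, they can be collected into one function that prints the commands by default. `run` and `release_primary` are hypothetical names, and the commands are the ones used in this issue.

```shell
#!/bin/sh
# Sketch: release the HA resources on the current primary MN.
DRY_RUN=${DRY_RUN:-1}      # 1 = only print the commands (assumed default)

# Print the command in dry-run mode, otherwise execute it.
run() { if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi; }

release_primary() {
    # Stop the services before the storage they use disappears.
    for svc in xcatd dhcpd postgresql; do
        run service "$svc" stop
    done
    run exportfs -ua
    # Unmount the shared data; stop if a directory is still busy.
    for dir in /etc/xcat /install /root/.xcat /tftpboot /var/lib/pgsql; do
        run umount "$dir" || return 1
    done
    # Drop the virtual ip last so the standby can take it over.
    run ifconfig eth0:0 0.0.0.0 0.0.0.0
}

release_primary
```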

on new primary mn(original standby mn):

1, Configure Virtual IP:
[root@bybc0605 x86_64]# ifconfig eth0:0 10.5.106.100 netmask 255.0.0.0
[root@bybc0605 ~]# hostname byrhmn

2, mount shared data:
[root@byrhmn ~]# mount -o rw c910f05c01bc06:/HA/etc/xcat /etc/xcat
[root@byrhmn ~]# mount -o rw c910f05c01bc06:/HA/root/.xcat ~/.xcat
[root@byrhmn ~]# mount -o rw c910f05c01bc06:/HA/install /install
[root@byrhmn ~]# mount -o rw c910f05c01bc06:/HA/var/lib/pgsql /var/lib/pgsql
[root@byrhmn ~]# mount -o rw 10.5.106.1:/HA/tftpboot /tftpboot
[root@byrhmn ~]# mkdir -p /install/netboot/rhels7.4/x86_64/compute
[root@byrhmn ~]# mkdir -p /tmp/rootimg
[root@byrhmn ~]# ln -s /tmp/rootimg /install/netboot/rhels7.4/x86_64/compute

3, start postgresql, xcatd, dhcpd etc
[root@byrhmn ~]# service postgresql start
[root@byrhmn ~]# service xcatd start
[root@byrhmn ~]# makedns -n
[root@byrhmn ~]# makedhcp -n
[root@byrhmn ~]# makedhcp -a
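
The takeover on the new primary mn has the opposite ordering constraint: the shared data must be mounted before postgresql and xcatd start. A minimal dry-run sketch of steps 1 to 3 (without the netboot directory setup) follows. `run` and `takeover` are hypothetical names; the addresses and hostnames are the ones from this issue.

```shell
#!/bin/sh
# Sketch: take over the HA resources on the new primary MN.
DRY_RUN=${DRY_RUN:-1}      # 1 = only print the commands (assumed default)

# Print the command in dry-run mode, otherwise execute it.
run() { if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi; }

takeover() {
    # Bring up the virtual ip and the virtual hostname first.
    run ifconfig eth0:0 10.5.106.100 netmask 255.0.0.0
    run hostname byrhmn
    # Mount the shared data; stop if any mount fails, because starting
    # postgresql on an empty local /var/lib/pgsql would be misleading.
    for dir in /etc/xcat /root/.xcat /install /var/lib/pgsql /tftpboot; do
        run mount -o rw "c910f05c01bc06:/HA$dir" "$dir" || return 1
    done
    # Start the database before xcatd, then regenerate DNS and DHCP.
    run service postgresql start
    run service xcatd start
    run makedns -n
    run makedhcp -n
    run makedhcp -a
}

takeover
```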

Verification:

1. provision diskless cn directly

copycds RHEL-7.4-20170711.0-Server-x86_64-dvd1.iso
genimage rhels7.4-x86_64-netboot-compute
packimage rhels7.4-x86_64-netboot-compute
cp /tmp/rootimg.cpio.gz /install/netboot/rhels7.4/x86_64/compute/
chmod 644 /install/netboot/rhels7.4/x86_64/compute/rootimg.cpio.gz
cat bybc0607 |chdef -z
makehosts bybc0607
makedns -n
makedhcp -n
makedhcp -a
nodeset bybc0607 osimage=rhels7.4-x86_64-netboot-compute
rpower bybc0607 reset

2. provision sn

[root@byrhmn xcat]# chtab key=nameservers site.value=""
[root@byrhmn xcat]# chdef bybc0607 groups=service,all
1 object definitions have been created or modified.
[root@byrhmn xcat]# chdef -t group -o service profile=service primarynic=mac installnic=mac
1 object definitions have been created or modified.
[root@byrhmn xcat]# chdef -t group -o service setupnfs=1 setupdhcp=1 setuptftp=1 setupnameserver=1 setupconserver=1
1 object definitions have been created or modified.
[root@byrhmn xcat]# chdef -t group -o service nfsserver=byrhmn tftpserver=byrhmn xcatmaster=byrhmn monserver=byrhmn
1 object definitions have been created or modified.
[root@byrhmn xcat]# chtab node=service postscripts.postscripts="servicenode"
[root@byrhmn xcat]# chdef -t site clustersite installloc="/install"
[root@byrhmn xcat]# chdef -t site hierarchicalattrs="postscripts"
[root@byrhmn xcat]# chdef -t site clustersite sharedtftp=0
1 object definitions have been created or modified.
[root@byrhmn xcat]# chdef -t site clustersite installloc=
1 object definitions have been created or modified.
[root@byrhmn xcat]# mkdir -p /install/post/otherpkgs/rhels7.4/x86_64/xcat
[root@byrhmn /]# cp -r xcat-core /install/post/otherpkgs/rhels7.4/x86_64/xcat
[root@byrhmn /]# cp -r xcat-dep /install/post/otherpkgs/rhels7.4/x86_64/xcat
[root@byrhmn /]# ls /install/post/otherpkgs/rhels7.4/x86_64/xcat
xcat-core xcat-dep
[root@byrhmn sysconfig]# lsdef -t osimage rhels7.4-x86_64-install-service
Object name: rhels7.4-x86_64-install-service
imagetype=linux
osarch=x86_64
osdistroname=rhels7.4-x86_64
osname=Linux
osvers=rhels7.4
otherpkgdir=/install/post/otherpkgs/rhels7.4/x86_64
otherpkglist=/opt/xcat/share/xcat/install/rh/service.rhels7.x86_64.otherpkgs.pkglist
pkgdir=/install/rhels7.4/x86_64
pkglist=/opt/xcat/share/xcat/install/rh/service.rhels7.x86_64.pkglist
postscripts=servicenode
profile=service
provmethod=install
template=/opt/xcat/share/xcat/install/rh/service.rhels7.tmpl
[root@byrhmn sysconfig]# nodeset bybc0607 osimage=rhels7.4-x86_64-install-service
[root@byrhmn sysconfig]# rpower bybc0607 reset
[root@byrhmn sysconfig]# rsync -auv --exclude 'autoinst' /install bybc0607:/
[root@byrhmn sysconfig]# rsync -auv --exclude 'autoinst' /tftpboot bybc0607:/
[root@byrhmn httpd]# xdsh bybc0607 nodels
bybc0607: bybc0607
[root@byrhmn httpd]# xdsh bybc0607 tabdump networks
bybc0607: #netname,net,mask,mgtifname,gateway,dhcpserver,tftpserver,nameservers,ntpservers,logservers,dynamicrange,staticrange,staticrangeincrement,nodehostname,ddnsdomain,vlanid,domain,mtu,comments,disable
bybc0607: "10_0_0_0-255_0_0_0","10.0.0.0","255.0.0.0","eth0","10.5.106.2",,"",,,,,,,,,,,"1500",,
