DOCUMENTATION

synctool by Walter de Jong <walter@heiho.net> (c) 2003-2012

synctool COMES WITH NO WARRANTY. synctool IS FREE SOFTWARE.
synctool is distributed under terms described in the GNU General Public
License.

See the README for general information.
See the INSTALL file for information on how to deploy synctool.


This DOCUMENTATION file tries to explain using synctool.

Online documentation in HTML pages are available at:
http://www.heiho.net/synctool/


In 'synctool terminology', a node is a host, a computer in a group of
computers. A group of computers is called a cluster.


Once you've installed and setup synctool, you will have a 'masterdir'
repository on the master node of your cluster. This masterdir is configured
in the synctool.conf file. If you don't know what I'm talking about here,
read the INSTALL file carefully. The masterdir is usually set to
/var/lib/synctool

synctool is usually run on the master node.
It will use rsync to mirror the masterdir to all the nodes in the cluster.
This has the advantage that you don't need a shared filesystem to run
synctool, and it is fast too.
Note that it is perfectly possible to run synctool.py "stand alone" on a
node, in which case it will check it's local copy of the repository. It will
not synchronize with the master repository (it's all server push, NOT client
pull).

The main power of synctool is the fact that you can define logical groups,
and you can add these to a filename as a filename extension. This will result
in the file being copied, only if the node belongs to the same group.
The groups a node is in, is defined in the synctool.conf file. In the
configuration file, the nodename is associated with one or more groups.
The nodename itself can also be used as a group to indicate that a file
belongs to that node.

In the synctool masterdir there are 5 subdirectories, each having its own
function:

    * overlay/
    * delete/
    * scripts/
    * tasks/
	* sbin/

The overlay/ tree contains files that have to be copied. When synctool
detects a difference between a file on the system and a file in the overlay
tree, the file will be copied from the overlay tree onto the system.

The delete/ tree contains files that always have to be deleted from the
system. Only the filename matters, so it is OK if the files in this tree are
only 0 bytes in size.

The executables in the scripts/ directory are executables that synctool can
run when needed. By means of the on_update directive in the synctool.conf
file, a designated script may be executed when a certain file is changed.
For example: when /etc/inetd.conf is updated, the script 'hupdaemon.sh inetd'
must be run.

The executables in de tasks/ directory are run when synctool is invoked with
the '-t' or '--tasks' argument. This makes it possible to run scripts on
hosts, which is very convenient for doing change management.

The sbin/ directory contains the synctool programs. Isn't this an odd place
to put 'binaries'? Yes, but these are the 'master copies' of the synctool
software. These are also synced to every node so that synctool can run
there.

Example run:

  root@masternode:/# synctool -qf
  node3: /etc/xinetd.d/identd updated (file size mismatch)

The file is being updated because there's a mismatch in the file size.
Should the file size be the same, synctool will calculate an MD5 checksum to
see whether the file was changed or not.

By default synctool does a DRY RUN. It will not do anything but show what
would happen if this would not be a dry run. Use -f or --fix to apply any
changes.

Now, I want the xinetd to be automatically reloaded after I change the identd
file. There are two ways to do this in synctool.

1. Old, classic way, the way it was done in synctool version <= 3.0

   Put in synctool.conf:

   on_update  /etc/xinetd.d/identd    /etc/init.d/xinetd reload


2. Modern way, synctool version >= 4.0, much easier:

   Put in file $masterdir/overlay/etc/xinetd.d/identd.post :

   /etc/init.d/xinetd reload


   The .post script will be run when the file changes:

   root@masternode:/# synctool -qf
   node3: /etc/xinetd.d/identd updated (file size mismatch)
   node3: running command $masterdir/overlay/etc/xinetd.d/identd.post

It is possible to put a group extension on the .post script, so that you
can have one group of nodes perform different actions than others.

The example for /etc/xinetd.d is interesting because you can also put a
"on_update" trigger or .post script on the directory itself. Whenever a file
in the directory gets modified, the trigger will be called. So, we can
simplify the situation for /etc/xinetd.d to:

    on_update   /etc/xinetd.d    /etc/init.d/xinetd reload

or

$masterdir/overlay/etc/xinetd.d.post:
    /etc/init.d/xinetd reload


The next example shows that the nodename can be used as a group.
In the example, the 'fstab' file is identical throughout the cluster, with
the exception of node5 and node7.

  root@masternode:/# ls -F /var/lib/synctool/overlay/etc
  ...
  fstab._all          motd.production._batch  sudoers._all
  fstab._node5        nsswitch.conf._all      sysconfig/
  fstab._node7        nsswitch.conf.old._all  sysctl.conf._all
  ...

The group 'all' implictly applies to all nodes. Likewise, there is an
implicit group 'none' that applies to no nodes. Group 'none' can be
convenient when you to have a copy of a file around, but do not wish to push
it to any nodes (yet).

The '-v' option gives verbose output. This is another way of displaying the
logic that synctool performs:

  # synctool -v
  ...
  node3: checking $masterdir/overlay/etc/tcpd_banner.production._all
  node3: overridden by $masterdir/overlay/etc/tcpd_banner.production._batch
  node3: checking $masterdir/overlay/etc/issue.net.production._all
  node3: checking $masterdir/overlay/etc/syslog.conf._all
  node3: checking $masterdir/overlay/etc/issue.production._all
  node3: checking $masterdir/overlay/etc/modules.conf._all
  node3: checking $masterdir/overlay/etc/hosts.allow.production._interactive
  node3: skipping $masterdir/overlay/etc/hosts.allow.production._interactive, it is not one of my groups
  ...

The '-q' option of synctool gives less output. This is my favorite option.

  root@masternode:/# synctool -q
  node3: /etc/xinetd.d/identd updated (file size mismatch)
  node3: running command /bin/rm /etc/xinetd.d/*.saved ; /etc/init.d/xinetd reload

If '-q' still gives too much output, because you have many nodes in your
cluster, it is possible to specify '-a' to condense (aggregate) output.
The condensed output groups together output that is the same for many nodes.
synctool -qa is one of my favorite commands.
synctool does this by piping the output through the synctool-aggr command.
You may also use this to condense output from dsh, for example

   # dsh date | synctool-aggr

or just

    # dsh -a date

    # dsh-ping -a

The '-f' or '--fix' option applies all changes. Always be sure to run synctool
at least once as a dry run! (without -f).
Mind that synctool does not lock the repository and does not guard against
concurrent use by multiple sysadmins at once. In practice, this hardly ever
led to any problems.

To update only a single file rather than all files, use the option
'--single' or '-1'.

If you want to check what file synctool is using for a given destination
file, use the '-r' or '--ref' option:

  root@masternode:/# synctool -q -n node1 -r /etc/resolv.conf
  node1: /etc/resolv.conf._somegroup


To inspect differences between the master copy and the client copy of a file,
use '--diff' or '-d'.

synctool can be run on a subset of nodes, a group, or even on individual
nodes using the options '--group' or '-g', '--node' or '-n', '--exclude' or
'-x', and '--exclude-group' or '-X'. This also works for dsh and dcp.
For example:

    # synctool -g batch,sched -X rack8

Another example:

    # dsh -n node1,node2,node3 date

or copy a file to these three nodes:

    # dcp -n node1,node2,node3 -d /tmp patchfile-1.0.tar.gz


You may also wish to pull a file from a node into the repository. You can do
this from the masternode like this:

    # synctool -n node1 --upload /path/to/file

It may be desirable to give the file a different group extension than the
default proposed by synctool:

    # synctool -n node1 --upload /path/to/file --suffix=somegroup

After rebooting a cluster, use dsh-ping to see if the nodes respond to ping
yet. You may also do this on a group of nodes:

    # dsh-ping -g rack4


The '-t' or '--tasks' option runs the de executables that are in tasks/
(if you also supply '-f'..!) These executables can also have classnames as
filename extension. They can be shell scripts or any other kind of
executables. This option is particularly useful for making system changes
that cannot be done easily by replacing a configuration file, like for
example installing new software packages. Mind to always include a check to
see whether the system change has already been made, or else it will always
keep installing the same software when it was already there. Doing system
changes through the tasks mechanism is highly recommended, for 2 reasons:

  * easy to see what changes are being done; all tasks are in tasks/
  * whenever a node is down, it can do the updates later to get back in sync

By default, '--tasks' is not being run. You have to explicitly specify this
argument to run tasks.

The '--unix' option produces unix-style output. This shows in standard shell
syntax just what synctool is about to do.
(Note: synctool does not apply changes by executing shell commands; all
operations are programmed in Python. This option is only a way of displaying
what synctool does, and may be useful when debugging).

  root@masternode:/# synctool --unix
  node3: # updating file /etc/xinetd.d/identd
  node3: mv /etc/xinetd.d/identd /etc/xinetd.d/identd.saved
  node3: umask 077
  node3: cp /var/lib/synctool/overlay/etc/xinetd.d/identd._all /etc/xinetd.d/identd
  node3: chown root.root /etc/xinetd.d/identd
  node3: chmod 0644 /etc/xinetd.d/identd
  node3:
  node3: # run command /bin/rm /etc/xinetd.d/*.saved ; /etc/init.d/xinetd reload
  node3: /bin/rm /etc/xinetd.d/*.saved ; /etc/init.d/xinetd reload


The '--skip-rsync' option skips the rsync run (that copies the repository
from the master server to the client node). You should only specify this
option when your repository resides on a shared filesystem. Sharing the
repository between your master server and client nodes has certain security
implications, so be mindful of what you are doing in such a setup.
If you have a fast shared filesystem between all client nodes, but it is
not shared with the master server, you may want to write a wrapper script
around synctool that first runs rsync to a single node to update the shared
repository, and then run synctool with the --skip-rsync option.


When using "--fix" to apply changes, synctool can log the performed actions
in a log file. Use the 'logfile' directive in synctool.conf to specify that
you want logging:

  logfile /var/log/synctool.log

synctool will write this logfile on each node seperately, and a concatenated
log on the master node.

By using directives in the synctool.conf file, synctool can be told to ignore
certain files, nodes, or groups. These will be excluded, skipped. For example:

  ignore_dotfiles no
  ignore_dotdirs yes
  ignore .svn
  ignore .gitignore
  ignore_node node1 node2
  ignore_group oldgroup
  ignore_group test


About symbolic links ...
On the Linux operating system, symbolic links always have mode 0777 (also
shown as lrwxrwxrwx). This is awkward, because this mode seems to imply that
owners, group members, and others have write access to the symbolic link --
which is not the case. This results in synctool complaining about the mode
of the symbolic link:

  /path/subdir/symlink should have mode 0755 (symlink), but has 0777

As a workaround, synctool forces the mode of the symbolic link to what
you set it to in synctool.conf. The hardcoded default value is 0755.

synctool.conf:
  # Linux
  symlink_mode 0777

or

  # sensible mode for most other Unix systems
  symlink_mode 0755


As synctool requires files in the repository to require an extension, so
do symbolic links. Symbolic links in the repository will be 'dead' symlinks
but they will point to the correct destination on the target node.
Consider the following example, where "file" does not exist as is in the
repository:

    $masterdir/overlay/etc/motd._red -> file
    $masterdir/overlay/etc/file._red


For any file synctool updates, it keeps a backup copy around on the target
node with the extension '.saved'. If you don't like this, you can tell
synctool to not make any backup copies with

  backup_copies no

It is however highly recommended that you run with 'backup_copies yes'.
You can manually specify that you want to remove all backup copies using
synctool -e or synctool --erase-saved --fix.


The settings in synctool.conf can be overridden locally by including a
second config file that is present on the node:

synctool.conf:
  include /etc/synctool_local.conf

The local config file may contain the same directives as the master config
file, but apply only to the node on which the file resides. The local config
file may be managed from the master repository, enabling you to have subtle
differences in the synctool configuration for certain nodes. For example,
this can be used to change the symlink_mode in heterogeneous clusters.
Beware that the local config file is read upon startup and may be synced
afterwards, if the file was changed.
Mind that including node specific configs increases the complexity of your
overall synctool configuration, so in general it is recommended that you
stick to using the master synctool.conf only, and not including any local
configs at all.
The 'include' keyword can also be used to clean up your config a little,
for example:

  include /var/lib/synctool/nodes.conf
  include /var/lib/synctool/on_update.conf

In this example all nodes will have the same config, because the masterdir
/var/lib/synctool/ is synced to all nodes.


synctool can check whether a new version of synctool itself is available by
using the '--check-update' option on the master node. You can check
periodically for updates by using --check-update in a crontab entry.
To download the latest version, run synctool '--download' on the master node.


EOB