[WIP] Complete rework of the role to simplify distributed site setup #53

Open
wants to merge 34 commits into
base: master

ganto
Member

@ganto ganto commented Apr 13, 2017

This is an elaboration on how to solve the setup of distributed sites as discussed in #45. It is still a work in progress but mostly functional.

I chose a new approach for setting up and connecting the distributed sites. Previously, the host running a site was responsible for configuring and setting up that site. However, that approach was limited to one site per server and required cumbersome manual configuration to connect multiple sites into a distributed site setup (see e.g. #41).

This PR introduces a new way. The distributed site setup is now mainly configured on the host running the master site, which therefore holds most of the information required to set up and synchronize with the slave sites. This makes it mostly transparent whether a site is running on the master site server or on another one.

These changes required an adjustment of the checkmk_server__distributed_sites format. It now accepts a list of master sites (and their slave sites). The simplest possible configuration for a two-server setup (server01/server02), where server01 runs the master site and server02 a slave site, would look as follows. Enable the debops-contrib.checkmk_server role on both servers, then set the following values in the inventory:

  • Define a slave site in the host_vars/server01:
checkmk_server__distributed_sites:
  - name: 'slave'
    delegate_to: 'server02'
  • Disable the default site in the host_vars/server02:
checkmk_server__site: False

Finally, run the provided checkmk_server.yml playbook on both servers. This creates the default site on server01 and adds a slave site (called slave) on the second server. The configuration of user accounts/passwords, URLs, network ports and synchronization is completely transparent to the user.

Technically, this is solved by a checkmk_server/env sub-role, which is mostly responsible for generating a facts file on server01 and server02 that can then be used by the playbook and subsequent tasks for their individual site configuration. Additionally, many tasks that need to run on server02 but require in-depth knowledge about server01 or the master site are triggered from the master server via delegate_to tasks.
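
As an illustration only, the delegation pattern described above might look roughly like the following task, run in a play targeting the master host. This is a sketch and not code from the PR: the task name, the use of `omd create`, and the `creates` path are assumptions; only the `name` and `delegate_to` keys come from the inventory example above.

```yaml
# Illustrative only: task name, command and paths are assumptions,
# not copied from the role in this PR.
- name: Create each slave site on the host it is delegated to
  command: 'omd create {{ item.name }}'
  args:
    # Skip the command if the site directory already exists.
    creates: '/omd/sites/{{ item.name }}'
  # The task is evaluated with the master's variables but executed
  # on the host named in the site definition.
  delegate_to: '{{ item.delegate_to }}'
  with_items: '{{ checkmk_server__distributed_sites }}'
```

Because the task is evaluated in the context of the master host, it can use everything the master knows about the site while still executing on the slave server.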

The chosen setup further allows an easy solution for fixing #44. However, while testing and investigating Check_MK in the course of writing this code, I found another, possibly much more elegant way to handle the site config synchronization. More on that later...

In the next few days I will clean up the documentation, fix the still missing debops.apache dependency, try to clean up the macros and leverage the official DebOps macros, and provide a simple Vagrant setup to simplify testing and debugging for me and potential users. Stay tuned.

@@ -162,8 +160,7 @@ checkmk_server__webapi_url: '{{ checkmk_server__site_url + "/check_mk/webapi.py"
 # :ref:`checkmk_server__ref_omd_config` for more details.
 checkmk_server__omd_config: '{{
   checkmk_server__omd_config_email +
-  checkmk_server__omd_config_core +
-  (checkmk_server__omd_config_livestatus if checkmk_server__multisite_livestatus|d() else [])
+  checkmk_server__omd_config_core
 }}'
Member

I was just looking into this when I saw your PR. I would propose to change this list to a dict.

Member Author

What would be the advantage?

Member

@ypid ypid Apr 19, 2017

A dict is much more flexible when changes to the final value need to be made on various Ansible variable levels. That is what various core roles like debops.ferm and debops.ifupdown are switching to or have already switched to. Refer to this example: https://github.com/debops-contrib/ansible-volkszaehler/blob/72f6a79c434d1d4c665a581506bc221507ee4608/defaults/main.yml#L383-L478
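
A minimal sketch of why a dict can be easier to override per level, using Ansible's `combine` filter. The variable names and keys below are hypothetical and are not the role's actual configuration format; the point is only that a later layer overrides individual keys, whereas concatenated lists can only be appended to.

```yaml
# Hypothetical variable names and keys, for illustration only.
example__config_default:
  'core': 'nagios'
  'livestatus': 'off'

# A host or group level only states what differs from the default.
example__config_host:
  'livestatus': 'on'

# Layers are merged key by key; later layers win.
example__config: '{{ example__config_default | combine(example__config_host) }}'
# example__config evaluates to: core: nagios, livestatus: on
```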

Member Author

Ok, I'll keep that in mind. I will still need to have a look at your DebOps macros anyway.

This was referenced Apr 13, 2017
@ganto ganto force-pushed the distributed-rework branch from 432e67d to e51f79f Compare April 19, 2017 04:44
@ganto
Member Author

ganto commented Apr 19, 2017

I rebased my patchset on current master to include all the recent feature and documentation updates.

# Configuration for debops.apache_ Ansible role.
checkmk_server__apache__dependent_vhosts:
  - name: '{{ checkmk_server__fqdn }}'
    by_role: 'debops-contrib.checkmk_server'
Member

Nice work! I guess the Apache omd snippet could be disabled and instead be included in checkmk_server__apache__dependent_vhosts to ensure that omd is only available for this vhost.

Member Author

Ya, I thought about that, but I wasn't sure how to do this properly. To add the snippet to the vhost I guess I have to define include: '/omd/apache/*.conf', but how do I get rid of the conf-enabled/zzz_omd.conf?

Member

Something like:

checkmk_server__apache__dependent_snippets:
  'zzz_omd':
    enabled: False

should do.

For the vhost, item.include: [ '/omd/apache/*.conf' ] can be tried.

Member Author

Added the snippet configuration as you suggested. However, it fails with:

TASK [debops.apache : Create conf-available snippets] ********************************************************************************************
[...]
failed: [cmk01] (item={'key': u'zzz_omd', 'value': {u'enabled': False}}) => {
    "failed": true, 
    "item": {
        "key": "zzz_omd", 
        "value": {
            "enabled": false
        }
    }, 
    "msg": "Unable to find 'etc/apache2/conf-available/zzz_omd.conf.j2' in expected paths."
}

I didn't find a successful way to tell the role that this configuration is provided externally. Any suggestion?

Member

My mistake, excuse me. Try:

checkmk_server__apache__dependent_snippets:
  'zzz_omd':
    enabled: False
    type: 'dont-create'

type: 'dont-create' is there specifically for your use case, as documented 😉
Have you seen the nice documentation of the role btw? Ref: https://docs.debops.org/en/latest/ansible/roles/ansible-apache/docs/
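
Putting the suggestions from this thread together, the inventory might end up looking roughly like the sketch below. It only combines fragments quoted in this conversation; the `include` key on the vhost was proposed as something to try and is an untested assumption here.

```yaml
# Sketch assembled from the fragments in this thread; untested.
checkmk_server__apache__dependent_snippets:
  'zzz_omd':
    enabled: False
    # The snippet file is provided externally, so the role should not
    # try to template it.
    type: 'dont-create'

checkmk_server__apache__dependent_vhosts:
  - name: '{{ checkmk_server__fqdn }}'
    by_role: 'debops-contrib.checkmk_server'
    # Assumption: include the OMD configuration only in this vhost.
    include: [ '/omd/apache/*.conf' ]
```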

Member Author

Ack, my bad. I read the documentation, but not far enough 😉 I thought there were only raw and divert. I guess I was a bit confused by the following sentence:

Type: raw

Available when ``item.type`` is set to ``raw`` or ``divert``.

@@ -11,14 +11,21 @@ galaxy_info:
   author: Reto Gantenbein
   description: 'Setup Check_MK monitoring server'
   license: 'GPL-3.0'
-  min_ansible_version: '2.1.5'
+  min_ansible_version: '2.3.0'
Member

Any reason to drop 2.2 support as well? a3a149a does not mention one.

Member Author

True, I didn't mention the reason here. Actually, I was bitten by ansible/ansible#14542. Ansible 2.3.0 fixed the issue for me. As mentioned in the linked PR, it wasn't fixed in the 2.2.x tree back then, and I never tried to find a release that does. 2.3.0 is the safe choice. I will update the commit message accordingly.

Lower Ansible versions might suffer from Ansible issue #14542.
Version 2.3.0 has been tested to solve the issue.
@ganto
Member Author

ganto commented May 19, 2017

In the meantime I managed to satisfy the repospec test suite. It passes when run from my own repository (which still misses 466e8bd). The results can be found at travis-ci.org: ganto/checkmk_server (branch distributed-rework). Once this PR is ready for merge, I will update debops/test-suite, which should make it pass here too.

The etc/ansible/facts.d/checkmk_server.j2 template is cleaned up now. Next will be lookup/checkmk_server__sites.j2, which again might shake up the entire configuration layout a bit.

@ganto ganto force-pushed the distributed-rework branch from a3a149a to 4d0e8ca Compare May 19, 2017 05:31
@ypid
Member

ypid commented Mar 11, 2018

@ganto Do you know when you can finish this PR? I would still like to look into the latest version of the role and make changes to it :)

@ganto
Member Author

ganto commented Apr 13, 2018

@ypid What changes do you have in mind?

At the moment my development activity on this role is stalled. I realized that my approach would also write the login secrets for the site synchronization as clear text into local Ansible facts. That definitely shouldn't happen. 👎

Maybe I will find some time again to work on it in the next few months...

@ypid
Member

ypid commented Apr 18, 2018

What changes do you have in mind?

Whatever is necessary. I would just like to have a clean state of the role so that I could propose bigger changes or reworks if I want to, without causing pain with rebasing or similar.
