Monitoring for IDVA microservices in cloud.gov
The IDVA project is composed of many different microservices, each needing to be monitored for performance, stability, and uptime. The monitoring microservice has the following goals:
- Provide monitoring capabilities for IDVA microservices
- Alert IDVA operators/admins on specified metric thresholds
IDVA monitoring is a set of monitoring tools that get deployed together to enable monitoring of the IDVA system. The repo is broken down by tool and contains:
- Prometheus: an HA Prometheus setup to monitor applications based on DNS querying of the application routes.
  - For every application we wish to monitor, adding a dns_sd_config entry to prometheus-config.yml adds the application to Prometheus's monitoring. Using dns_sd_config lets us see and query every instance of the application, rather than being load balanced to a random instance on each query.
- Grafana: A simple dashboard setup to view some of the IDVA metrics in real-time
- Alertmanager: an HA Alertmanager cluster that handles alert deduplication and routing.
- Cortex: A single-binary-mode Cortex instance for shipping metrics to S3 for long-term storage.
- Watchtower: A run-anywhere, Cloud Foundry drift detection service (designed to be scraped by Prometheus).
- Kibana: A basic Kibana setup for quickly querying Elasticsearch data in cloud.gov
- Elasticsearch: A Prometheus exporter for Elasticsearch metrics on cloud.gov
- Redis: A Prometheus exporter for Redis metrics on cloud.gov
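As a rough illustration of the DNS-based discovery described for Prometheus above, the snippet below writes a sample scrape job to a scratch file and prints it. Note that the key in Prometheus's configuration is spelled dns_sd_configs (plural). The job name, the internal route example-app.apps.internal, the port, and the refresh interval are all hypothetical placeholders and are not taken from this repo's prometheus-config.yml.

```shell
# Write a hypothetical scrape job to a scratch file for inspection.
# Nothing here is copied from the repo's real configuration.
fragment="$(mktemp)"
cat > "$fragment" <<'EOF'
scrape_configs:
  - job_name: example-app                  # hypothetical job name
    dns_sd_configs:
      - names:
          - example-app.apps.internal      # hypothetical internal route
        type: A                            # A lookup: one target per address returned
        port: 8080                         # assumed application port (required for A records)
        refresh_interval: 30s              # how often the names are re-resolved
EOF
cat "$fragment"
```

With an A-record lookup, each address returned for the name becomes its own scrape target, which is what makes per-instance queries possible instead of hitting whichever instance a load balancer picks.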
The config files are generic so that we do not need a separate configuration file for each space (dev, test, prod, etc.). The *-config.yml files are templates intended to be fed to envsubst after the appropriate environment variables have been set. The rendered output should be written to the appropriately named files (see examples below).
# Examples of config file generation using envsubst
envsubst < prometheus/prometheus-config.yml > prometheus/prometheus.yml
envsubst < alertmanager/alert-config.yml > alertmanager/alertmanager.yml
The most up-to-date information about the CI/CD flows for this repo can be found in the GitHub workflows directory.
This project is in the worldwide public domain. As stated in CONTRIBUTING:
This project is in the public domain within the United States, and copyright and related rights in the work worldwide are waived through the CC0 1.0 Universal public domain dedication.
All contributions to this project will be released under the CC0 dedication. By submitting a pull request, you are agreeing to comply with this waiver of copyright interest.