Refactor scripts to facilitate dynamic monitoring #7

Status: Open. 5 commits to merge into base: main
3 changes: 3 additions & 0 deletions .gitignore
@@ -4,3 +4,6 @@
/tf/.terraform.lock.hcl
/ansible/testnet/*
/ansible/hosts
# These files are auto-generated during deployment
/ansible/secrets.yaml
/ansible/roles/telegraf/vars/secret.yaml
54 changes: 19 additions & 35 deletions Makefile
@@ -1,50 +1,34 @@
DO_INSTANCE_TAGNAME=v035-testnet
LOAD_RUNNER_COMMIT_HASH ?= 51685158fe36869ab600527b852437ca0939d0cc
LOAD_RUNNER_CMD=go run github.com/tendermint/tendermint/test/e2e/runner@$(LOAD_RUNNER_COMMIT_HASH)
E2E_RUNNER_VERSION=v0.35.5
export DO_INSTANCE_TAGNAME
export LOAD_RUNNER_CMD
export E2E_RUNNER_VERSION

.PHONY: terraform-init
terraform-init:
.PHONY: init
init:
$(MAKE) -C ./tf/ init

.PHONY: terraform-apply
terraform-apply:
.PHONY: deploy
deploy:
$(MAKE) -C ./tf/ apply
./script/configgen.sh ./ansible/hosts
./script/secretsgen.sh ./ansible/secrets.yaml
ANSIBLE_HOST_KEY_CHECKING=False \
ansible-playbook -i ./ansible/hosts -u root ./ansible/deploy.yaml -f 10

.PHONY: hosts
hosts:
echo "[validators]" > ./ansible/hosts
doctl compute droplet list --tag-name $(DO_INSTANCE_TAGNAME) --tag-name "testnet-node" | tail -n+2 | tr -s ' ' | cut -d' ' -f2,3 | sort -k1 | sed 's/\(.*\) \(.*\)/\2 name=\1/g' >> ./ansible/hosts
echo "[prometheus]" >> ./ansible/hosts
doctl compute droplet list --tag-name $(DO_INSTANCE_TAGNAME) --tag-name "testnet-observability" | tail -n+2 | tr -s ' ' | cut -d' ' -f3 >> ./ansible/hosts

.PHONY: configgen
configgen:
./script/configgen.sh `tail -n+2 ./ansible/hosts | head -n -2 |cut -d' ' -f1| paste -s -d, -`

.PHONY: ansible-install
ansible-install:
cd ansible && \
ansible-playbook -i hosts -u root base.yaml -f 10 && \
ansible-playbook -i hosts -u root prometheus-node-exporter.yaml -f 10 && \
ansible-playbook -i hosts -u root init-testapp.yaml -f 10 && \
ansible-playbook -i hosts -u root update-testapp.yaml -f 10

.PHONY: prometheus-init
prometheus-init:
cd ansible && ansible-playbook -i hosts -u root prometheus.yaml -f 10

.PHONY: start-network
start-network:
cd ansible && ansible-playbook -i hosts -u root start-testapp.yaml -f 10
.PHONY: update-testapp
update-testapp:
./script/configgen.sh ./ansible/hosts
ANSIBLE_HOST_KEY_CHECKING=False \
ansible-playbook -i ./ansible/hosts -u root ./ansible/update-testapp.yaml

.PHONY: runload
runload:
$(LOAD_RUNNER_CMD) load \
--ip-list `tail -n+2 ./ansible/hosts | head -n -2 |cut -d' ' -f1| paste -s -d, -` \
--seed-delta $(shell echo $$RANDOM)
./script/runload.sh ./ansible/hosts

.PHONY: terraform-destroy
terraform-destroy:
.PHONY: destroy
destroy:
$(MAKE) -C ./tf/ destroy

56 changes: 38 additions & 18 deletions README.md
@@ -10,7 +10,7 @@ test networks on Digital Ocean (DO).
- [Ansible CLI][Ansible]
- Go

## Instructions
## Deployment

After you have all the prerequisites installed and have configured your
[`testnet.toml`](./testnet.toml) file appropriately:
@@ -32,30 +32,49 @@ ssh_keys = ["ab:cd:ef:01:23:45:67:89:ab:cd:ef:01:23:45:67:89"]
EOF

# 4. Initialize Terraform (only needed once)
make terraform-init
make init

# 5. Create the VMs for the validators and Prometheus as specified in ./testnet.toml
# Be sure to use your actual DO token and SSH key fingerprints for the DO_TOKEN
# and DO_SSH_KEYS variables.
make terraform-apply
# 5. Create the VMs for the validators and monitoring server as specified in
# ./testnet.toml
make deploy

# 6. Discover the IP addresses of the hosts for Ansible
make hosts
# 6. Execute a load test against the network
make runload
```

# 7. Generate the testnet configuration
make configgen
## Data visualization

# 8. Install all necessary software on the created VMs using Ansible
make ansible-install
Once you have deployed a testnet, there will be a "monitor" server available
running an [InfluxDB] instance. Check the generated `ansible/hosts` file for the
IP address of the monitor and navigate to `http://<monitor-ip>:8086` in your web
browser to access the InfluxDB interface.
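
As a rough sketch of looking up the monitor address from the shell, the function below prints the first field of every host line under a given group header in an INI-style Ansible inventory. The `[monitor]` group name is inferred from `hosts: monitor` in `ansible/deploy.yaml`; the exact layout of the generated `ansible/hosts` file is not shown in this PR, so both that layout and the helper name `group_hosts` are assumptions.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of this PR): list the hosts in one group of
# an INI-style inventory, assuming the address is the first field of each line.
# Usage: group_hosts <inventory-file> <group-name>
group_hosts() {
  awk -v group="[$2]" '
    /^\[/        { in_group = ($1 == group); next }  # a new [section] starts
    /^[ \t]*$/   { next }                            # skip blank lines
    /^[ \t]*[#;]/ { next }                           # skip comment lines
    in_group     { print $1 }                        # print the host address
  ' "$1"
}
```

If the inventory does contain a `[monitor]` group, `group_hosts ./ansible/hosts monitor` would print the address to use in `http://<monitor-ip>:8086`.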

# 9. Initialize the Prometheus instance
make prometheus-init
The username is `admin` and the password is automatically generated during
deployment. The password can be found in the `ansible/secrets.yaml` file (not
committed to the repository).

# 10. Start the test application on all of the validators
make start-network
The UI is relatively straightforward, but if you need additional help please
see the [InfluxDB docs][InfluxDB].

# 11. Execute a load test against the network
make runload
## Reloading the test app

In cases where you don't want to tear down the infrastructure and only want to
reload the test app running across the network (say there are new changes on the
`v0.35.x` branch in the Git repo):

```bash
make update-testapp
```

This will stop the test app, remove all config and data, redeploy the config,
and restart the test app.

## Teardown

To destroy all Digital Ocean infrastructure:

```bash
make destroy
```

## Metrics
@@ -68,3 +87,4 @@ metrics and view their associated graphs.
[Ansible]: https://docs.ansible.com/ansible/latest/index.html
[Terraform]: https://www.terraform.io/docs
[doctl]: https://docs.digitalocean.com/reference/doctl/how-to/install/
[InfluxDB]: https://docs.influxdata.com/influxdb/v2.2/
22 changes: 0 additions & 22 deletions ansible/base.yaml

This file was deleted.

19 changes: 0 additions & 19 deletions ansible/config-deploy.yaml

This file was deleted.

25 changes: 25 additions & 0 deletions ansible/deploy.yaml
@@ -0,0 +1,25 @@
---
# This playbook must be executed as root.
Review comment (Contributor): I'm assuming you mean root on the machine you're deploying to?

Reply (Contributor Author): Yes, I'll clarify that 🙂

#
# It's also critical that the monitor is deployed first before the nodes
# because the monitor deployment generates an API token for Telegraf instances
# on the nodes to access the InfluxDB database on the monitor.
- hosts: monitor
  become: no
  vars_files:
    - ./vars.yaml
    - ./secrets.yaml
  roles:
    - common
    - influxdb

- hosts: nodes
  become: no
  vars_files:
    - ./vars.yaml
    - ./secrets.yaml
  roles:
    - common
    - telegraf
    - tendermint
    - testapp
14 changes: 0 additions & 14 deletions ansible/init-testapp.yaml

This file was deleted.

19 changes: 0 additions & 19 deletions ansible/prometheus-node-exporter.yaml

This file was deleted.

29 changes: 0 additions & 29 deletions ansible/prometheus.yaml

This file was deleted.

23 changes: 0 additions & 23 deletions ansible/remove-testapp-data.yaml

This file was deleted.

11 changes: 0 additions & 11 deletions ansible/restart-testapp.yaml

This file was deleted.

17 changes: 17 additions & 0 deletions ansible/roles/common/files/iptables-rules.v4
@@ -0,0 +1,17 @@
# Allow SSH on port 22 and related traffic. Rate-limit SSH login attempts.
# Log and drop failed SSH logins.
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [368:94560]
:LOGDROP - [0:0]
-A INPUT -i lo -j ACCEPT
-A INPUT -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A INPUT -p tcp -m tcp --dport 22 -m conntrack --ctstate NEW -m recent --update --seconds 60 --hitcount 11 --name DEFAULT --mask 255.255.255.255 --rsource -j LOGDROP
-A INPUT -p tcp -m tcp --dport 22 -m conntrack --ctstate NEW -m recent --set --name DEFAULT --mask 255.255.255.255 --rsource
-A INPUT -p tcp -m tcp --dport 22 -j ACCEPT
-A INPUT -m limit --limit 5/min -j LOG --log-prefix "iptables denied: " --log-level 7
-A INPUT -j DROP
-A LOGDROP -j LOG --log-prefix "iptables denied ssh: " --log-level 7
-A LOGDROP -j DROP
COMMIT
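
To make the SSH rate limit concrete, here is a simplified shell model of the two `-m recent` rules for a single source address. It assumes that `--update` matches, and records the packet, once at least `--hitcount` earlier entries fall within `--seconds`, and that `--set` records every packet that falls through to it. It ignores the kernel's per-address cap on stored timestamps, and the function name `simulate_ssh_limit` is hypothetical.

```bash
#!/usr/bin/env bash
# Simplified model (an assumption-laden sketch, not the kernel's code) of:
#   -m recent --update --seconds 60 --hitcount 11 ... -j LOGDROP
#   -m recent --set ...
#   -j ACCEPT
# Arguments: arrival times, in seconds, of NEW connections to port 22 from
# one source address, in ascending order.
simulate_ssh_limit() {
  local window=60 hitcount=11
  local stamps="" t s hits
  for t in "$@"; do
    hits=0
    for s in $stamps; do
      # Count earlier connections that are still inside the window.
      if [ $((t - s)) -le "$window" ]; then
        hits=$((hits + 1))
      fi
    done
    # Either rule records this connection: --update on a match, --set otherwise.
    stamps="$stamps $t"
    if [ "$hits" -ge "$hitcount" ]; then
      echo "$t LOGDROP"
    else
      echo "$t ACCEPT"
    fi
  done
}
```

Under this model, `simulate_ssh_limit 0 1 2 3 4 5 6 7 8 9 10 11` accepts the first eleven connections and sends the twelfth to `LOGDROP`. Because dropped attempts are also recorded, a client that keeps retrying stays blocked until it has been quiet long enough for fewer than eleven entries to remain inside the 60-second window.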
17 changes: 17 additions & 0 deletions ansible/roles/common/files/iptables-rules.v6
@@ -0,0 +1,17 @@
# Allow SSH on port 22 and related traffic. Rate-limit SSH login attempts.
# Log and drop failed SSH logins.
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:LOGDROP - [0:0]
-A INPUT -i lo -j ACCEPT
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A INPUT -p tcp -m tcp --dport 22 -m conntrack --ctstate NEW -m recent --update --seconds 60 --hitcount 11 --name DEFAULT --mask ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff --rsource -j LOGDROP
-A INPUT -p tcp -m tcp --dport 22 -m conntrack --ctstate NEW -m recent --set --name DEFAULT --mask ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff --rsource
-A INPUT -p tcp -m tcp --dport 22 -j ACCEPT
-A INPUT -m limit --limit 5/min -j LOG --log-prefix "ip6tables denied: " --log-level 7
-A INPUT -j DROP
-A LOGDROP -j LOG --log-prefix "ip6tables denied ssh: " --log-level 7
-A LOGDROP -j DROP
COMMIT
27 changes: 27 additions & 0 deletions ansible/roles/common/tasks/main.yaml
@@ -0,0 +1,27 @@
- name: install common dependencies
  ansible.builtin.apt:
    name:
      - iptables
      - iptables-persistent
    state: latest
    update_cache: yes
    cache_valid_time: 60

- name: ensure persistent iptables dir exists
  ansible.builtin.file:
    path: /etc/iptables
    state: directory

- name: copy base iptables rules
  ansible.builtin.copy:
    src: "iptables-{{ item }}"
    dest: "/etc/iptables/{{ item }}"
  loop:
    - rules.v4
    - rules.v6

- name: apply base ipv4 iptables rules
  ansible.builtin.shell: "iptables-restore /etc/iptables/rules.v4"

- name: apply base ipv6 iptables rules
  ansible.builtin.shell: "ip6tables-restore /etc/iptables/rules.v6"
4 changes: 4 additions & 0 deletions ansible/roles/influxdb/files/config.toml
@@ -0,0 +1,4 @@
bolt-path = "/var/lib/influxdb/influxd.bolt"
engine-path = "/var/lib/influxdb/engine"
reporting-disabled = true
http-bind-address = ":8086"