From e48456dbbc8ff3731d8fea70f805e060b8a4c4d7 Mon Sep 17 00:00:00 2001 From: Will Hickey Date: Tue, 13 Sep 2022 14:33:49 -0500 Subject: [PATCH] Create continuous delivery directory and add readme with canary details. (#27683) * Create continuous delivery directory and add readme with canary details. --- cd/README.md | 33 +++++++++++++++++++++++++++++++++ 1 file changed, 33 insertions(+) create mode 100644 cd/README.md diff --git a/cd/README.md b/cd/README.md new file mode 100644 index 00000000000000..fb2864974861f1 --- /dev/null +++ b/cd/README.md @@ -0,0 +1,33 @@ +## Canaries +In order to reduce the risk associated with deploying updates Solana Labs operates the validator software on canary nodes on mainnet-beta +and testnet. The mainnet-beta nodes are non-voting. These nodes update themselves on a regular schedule. + +|Canary|Host|Dashboards|Identity|Cluster|Channel|Days between updates| +|------|----|----------|--------|-------|-------|-------------------:| +|mce1|edge-validator-us-sv15|[Cluster](https://metrics.solana.com:3000/d/monitor-beta/cluster-telemetry-beta?orgId=1&var-datasource=InfluxDB_main-beta&var-testnet=mainnet-beta&var-hostid=edge8WksfN71qYDJ1e4fCy2WfKg19fXU5zuztDi9uTM) [System](https://metrics.solana.com:3000/d/rYdddlPWkk/system-metrics-full?orgId=1&var-DS_PROMETHEUS=Metrics-Prometheus&var-job=All&var-node=edge-validator-us-sv15:9100&var-diskdevices=%5Ba-z%5D%2B%7Cnvme%5B0-9%5D%2Bn%5B0-9%5D%2B&refresh=5m&from=now-15m&to=now)|edge8WksfN71qYDJ1e4fCy2WfKg19fXU5zuztDi9uTM|mainnet-beta|edge|2| +|mce2|canary-am6-2|[Cluster](https://metrics.solana.com:3000/d/monitor-beta/cluster-telemetry-beta?orgId=1&var-datasource=InfluxDB_main-beta&var-testnet=mainnet-beta&var-hostid=mce2CKApCodefxDBUDWXCdBkqoh2dg1vpWJJX2qfuvV) [System](https://metrics.solana.com:3000/d/rYdddlPWkk/system-metrics-full?orgId=1&var-DS_PROMETHEUS=Metrics-Prometheus&var-job=All&var-node=canary-am6-2:9100&var-diskdevices=%5Ba-z%5D%2B%7Cnvme%5B0-9%5D%2Bn%5B0-9%5D%2B&from=now-15m&to=now&refresh=5m)|mce2CKApCodefxDBUDWXCdBkqoh2dg1vpWJJX2qfuvV|mainnet-beta|edge|4| +|mce3|canary-da11-1|[Cluster](https://metrics.solana.com:3000/d/monitor-beta/cluster-telemetry-beta?orgId=1&var-datasource=InfluxDB_main-beta&var-testnet=mainnet-beta&var-hostid=mce3QKzeRNwk3TC6XfHTy5hdRT6u5UKm4rKQbNKkFhF) [System](https://metrics.solana.com:3000/d/rYdddlPWkk/system-metrics-full?orgId=1&var-DS_PROMETHEUS=Metrics-Prometheus&var-job=All&var-node=canary-da11-1%3A9100&var-diskdevices=%5Ba-z%5D%2B%7Cnvme%5B0-9%5D%2Bn%5B0-9%5D%2B&refresh=5m&from=now-15m&to=now)|mce3QKzeRNwk3TC6XfHTy5hdRT6u5UKm4rKQbNKkFhF|mainnet-beta|edge|8| +|mcb1|beta-validator-us-ny5|[Cluster](https://metrics.solana.com:3000/d/monitor-beta/cluster-telemetry-beta?orgId=1&var-datasource=InfluxDB_main-beta&var-testnet=mainnet-beta&var-hostid=betaVcnkBhHKaWx9o6LoSYrGaoDCskQLm94cUVWqDLS) [System](https://metrics.solana.com:3000/d/rYdddlPWkk/system-metrics-full?orgId=1&var-DS_PROMETHEUS=Metrics-Prometheus&var-job=Mainnet-Beta&var-node=beta-validator-us-ny5%3A9100&var-diskdevices=%5Ba-z%5D%2B%7Cnvme%5B0-9%5D%2Bn%5B0-9%5D%2B&refresh=5m&from=now-15m&to=now)|betaVcnkBhHKaWx9o6LoSYrGaoDCskQLm94cUVWqDLS|mainnet-beta|beta|2| +|mcb2|canary-ny5-2|[Cluster](https://metrics.solana.com:3000/d/monitor-beta/cluster-telemetry-beta?orgId=1&var-datasource=InfluxDB_main-beta&var-testnet=mainnet-beta&var-hostid=mcb2qZRoYgy4bJkPRv6cBwLAAYow9ZsSzcrjJKprUnd) [System](https://metrics.solana.com:3000/d/rYdddlPWkk/system-metrics-full?orgId=1&var-DS_PROMETHEUS=Metrics-Prometheus&var-job=All&var-node=canary-ny5-2:9100&var-diskdevices=%5Ba-z%5D%2B%7Cnvme%5B0-9%5D%2Bn%5B0-9%5D%2B&refresh=5m&from=now-15m&to=now)|mcb2qZRoYgy4bJkPRv6cBwLAAYow9ZsSzcrjJKprUnd|mainnet-beta|beta|4| +|mcs1|canary-am6-1|[Cluster](https://metrics.solana.com:3000/d/monitor-beta/cluster-telemetry-beta?orgId=1&var-datasource=InfluxDB_main-beta&var-testnet=mainnet-beta&var-hostid=mcs1kpUkWeqoruxWwtCskY1GGF4Bx1t3MMtHSHoSLyC) [System](https://metrics.solana.com:3000/d/rYdddlPWkk/system-metrics-full?orgId=1&var-DS_PROMETHEUS=Metrics-Prometheus&var-job=All&var-node=canary-am6-1:9100&var-diskdevices=%5Ba-z%5D%2B%7Cnvme%5B0-9%5D%2Bn%5B0-9%5D%2B&refresh=5m&from=now-15m&to=now)|mcs1kpUkWeqoruxWwtCskY1GGF4Bx1t3MMtHSHoSLyC|mainnet-beta|stable|2| +|mcs2|canary-ny5-1|[Cluster](https://metrics.solana.com:3000/d/monitor-beta/cluster-telemetry-beta?orgId=1&var-datasource=InfluxDB_main-beta&var-testnet=mainnet-beta&var-hostid=mcs2ZZa1vHhbUyfZ6ttHEGpFU6pib4pm4ownTxBm6Jc) [System](https://metrics.solana.com:3000/d/rYdddlPWkk/system-metrics-full?orgId=1&var-DS_PROMETHEUS=Metrics-Prometheus&var-job=All&var-node=canary-ny5-1:9100&var-diskdevices=%5Ba-z%5D%2B%7Cnvme%5B0-9%5D%2Bn%5B0-9%5D%2B&refresh=5m&from=now-15m&to=now)|mcs2ZZa1vHhbUyfZ6ttHEGpFU6pib4pm4ownTxBm6Jc|mainnet-beta|stable|4| +|tce1|canary-sg1-1|[Cluster](https://metrics.solana.com:3000/d/monitor-beta/cluster-telemetry-beta?orgId=1&var-datasource=InfluxDB-testnet&var-testnet=tds&var-hostid=tce1TNB7pchMpM36eyzvttBDEwTczv86o5P2SS8dpSU) [System](https://metrics.solana.com:3000/d/rYdddlPWkk/system-metrics-full?orgId=1&var-DS_PROMETHEUS=Metrics-Prometheus&var-job=All&var-node=canary-sg1-1:9100&var-diskdevices=%5Ba-z%5D%2B%7Cnvme%5B0-9%5D%2Bn%5B0-9%5D%2B&refresh=5m&from=now-15m&to=now)|tce1TNB7pchMpM36eyzvttBDEwTczv86o5P2SS8dpSU|testnet|edge|4| +|tcb1|canary-sv15-1|[Cluster](https://metrics.solana.com:3000/d/monitor-beta/cluster-telemetry-beta?orgId=1&var-datasource=InfluxDB-testnet&var-testnet=tds&var-hostid=tcb1JYngtdwigQaJV6t13TJSnKuEPitpwoHS5TAYg1H) [System](https://metrics.solana.com:3000/d/rYdddlPWkk/system-metrics-full?orgId=1&var-DS_PROMETHEUS=Metrics-Prometheus&var-job=All&var-node=canary-sv15-1:9100&var-diskdevices=%5Ba-z%5D%2B%7Cnvme%5B0-9%5D%2Bn%5B0-9%5D%2B&refresh=5m&from=now-15m&to=now)|tcb1JYngtdwigQaJV6t13TJSnKuEPitpwoHS5TAYg1H|testnet|beta|4| + +The canaries update themselves every few days according to the schedule above. They should not be updated manually. The updates get the newest build for the given channel which is often not a tagged release. + +These are handy commands to see what versions are currently running on each node: +``` +solana gossip -um | grep -E " (edge|beta|mc[ebs]\d)" + +solana gossip -ut | grep -E " (tc[eb]\d)" +``` + +Nodes update themselves when `(number of days since 1970-01-01)` mod `(days between updates)` is 0. Use this command (and the table above) to check when a node will update: + +``` +DAYS_BETWEEN_UPDATES=8; d=$(expr $(date +%s) / 86400 % $DAYS_BETWEEN_UPDATES); n=$(expr $DAYS_BETWEEN_UPDATES - $d); echo "Updated $d day(s) ago. Will update $n day(s) from now" +``` + +# Alerts +Canaries are monitored by watchtower and alert in Slack #pager-duty-canary