Speed up deploy optimistically #106

yosifkit · 2025-01-25T00:52:01Z

Trust that if an item was pushed successfully in a previous deploy, then we don't need to push it this time.

We can "Replay" a job with an adjusted filter stage to skip the filtering (i.e., skip the wget and just use an empty past-deploy.json) to get the previous behavior of pushing everything.

Test jobs:

https://doi-janky.infosiftr.net/job/meta/view/deploys/job/arm32v5/job/deploy/11747/ (no previous deploy.json)
https://doi-janky.infosiftr.net/job/meta/view/deploys/job/arm32v5/job/deploy/11748/ (successful good path)
https://doi-janky.infosiftr.net/job/meta/view/deploys/job/arm32v5/job/deploy/11749/ (purposely failing with bad archived artifacts)
https://doi-janky.infosiftr.net/job/meta/view/deploys/job/arm32v5/job/deploy/11750/ (not affected by past failed job, downloads past deploy.json from lastSuccessfulBuild)

yosifkit · 2025-01-25T00:53:58Z

With this, we could likely change the rateLimitBuilds to allow the job to run more often.

yosifkit · 2025-01-25T01:05:12Z

Time testing with the large deploy.json from amd64:

$ time jq -L. 'include "deploy"; arch_tagged_manifests("amd64") | deploy_objects[]' ../meta/builds.json > deploy.json

real    0m3.578s
user    0m3.504s
sys     0m0.040s
$ cp deploy.json past-deploy.json
$ ls -lnh *deploy.json
-rw-r--r-- 1 1000 1000 3.3M Jan 24 16:59 deploy.json
-rw-r--r-- 1 1000 1000 3.3M Jan 24 16:59 past-deploy.json
$ time jq --slurpfile past past-deploy.json 'select( IN($past[]) | not )' ./deploy.json

real    0m0.694s
user    0m0.691s
sys     0m0.000s
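To make the filter above easier to follow, here is a minimal sketch on a tiny data set. The `filter-demo` directory, the file contents, and the `tag` field are hypothetical stand-ins; real deploy objects are much larger. It assumes a jq version that provides `IN` (1.6 or newer).

```shell
mkdir -p filter-demo

# Hypothetical stand-ins for the previous and the current deploy lists
# (one JSON object per line, like the stream jq writes to deploy.json).
printf '%s\n' '{"tag":"example:a"}' '{"tag":"example:b"}' > filter-demo/past-deploy.json
printf '%s\n' '{"tag":"example:a"}' '{"tag":"example:b"}' '{"tag":"example:c"}' > filter-demo/deploy.json

# Keep only the objects that do not appear verbatim in the previous deploy.
jq -c --slurpfile past filter-demo/past-deploy.json \
	'select( IN($past[]) | not )' filter-demo/deploy.json > filter-demo/new-items.json
cat filter-demo/new-items.json

# With an empty past-deploy.json, $past is an empty array, so nothing is
# filtered out and every object is kept (the old "push everything" behavior).
: > filter-demo/past-deploy.json
jq -c --slurpfile past filter-demo/past-deploy.json \
	'select( IN($past[]) | not )' filter-demo/deploy.json > filter-demo/all-items.json
cat filter-demo/all-items.json
```

The comparison is whole-object equality, so an item whose object changed in any way since the last deploy is still treated as new and pushed again.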

tianon · 2025-01-28T18:35:12Z

Jenkinsfile.deploy

+					! wget --timeout=5 -qO past-deploy.json "$JOB_URL/lastSuccessfulBuild/artifact/deploy.json" \\
+					|| ! jq 'empty' past-deploy.json \\
+				; then
+					touch past-deploy.json


With the second half of the conditional checking for valid JSON, this should probably do truncate with a zero size instead, right? (or we remove that check and let the jobs fail until we take explicit action, which is maybe safer since it'd be something unexpected and failing is easier to notice than slower)

Oh, yeah. Let me adjust it. 🤦
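The distinction raised here can be seen in a small sketch: `touch` leaves the content of an existing file alone, while truncating resets it to zero bytes. The `truncate-demo` directory and the file contents are hypothetical.

```shell
mkdir -p truncate-demo

# Simulate a download that left something other than JSON behind
# (hypothetical content; any leftover bytes behave the same way).
printf '%s\n' '<html>not json</html>' > truncate-demo/touched.json
cp truncate-demo/touched.json truncate-demo/truncated.json

# touch only updates the timestamps of an existing file, so the bad content stays.
touch truncate-demo/touched.json

# Redirecting nothing into the file truncates it to zero bytes.
# (GNU coreutils also offers: truncate -s 0 FILE)
: > truncate-demo/truncated.json

# Show the resulting sizes in bytes.
wc -c truncate-demo/touched.json truncate-demo/truncated.json
```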

🤔 We only need the if ! wget for the first time it is run. After that, does it make sense to fail if there isn't one? So, just as committed in the PR or perhaps without the jq test the first time and once they all run, drop the if and just let wget fail?

if ! wget --timeout=5 -qO past-deploy.json "$JOB_URL/lastSuccessfulBuild/artifact/deploy.json"; then
	touch past-deploy.json
fi

Hmm yeah, we could do that, although if we ever need to bootstrap again it's a little bit of a hassle.

We could even start with the raw failing wget version, perhaps leaving the touch version commented so we have an easier "replay" target and use replay to bootstrap all of them the first time?

(then we also have an obvious place to leave a comment right where we might investigate in the future, describing how if anything goes wrong here and we need to re-bootstrap, here's how to do it)

I have updated it to fail if the wget fails with a note and touch command for bootstrapping.
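A rough sketch of the shape described in this thread, not the merged Jenkinsfile.deploy content: the download is allowed to fail the job, and the bootstrap command is kept next to it as a comment. The function name `fetch_past_deploy` is hypothetical, and `JOB_URL` is assumed to be set by Jenkins.

```shell
# Hypothetical helper; in the real job this would live inline in the pipeline's shell step.
fetch_past_deploy() {
	# If anything goes wrong here and the job needs to be (re-)bootstrapped,
	# "Replay" it with the wget line swapped for the commented-out touch below,
	# which yields an empty past-deploy.json so that everything is pushed.
	#touch past-deploy.json
	wget --timeout=5 -qO past-deploy.json "$JOB_URL/lastSuccessfulBuild/artifact/deploy.json"
}
```

Because `wget` is the last command in the function, its exit status becomes the function's status, so a failed download surfaces as a failed step instead of silently falling back to a full push.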

Trust that if an item was pushed successfully in a previous deploy, then we don't need to push it this time

tianon

This is great -- we also need to couple this with a new job that does "trust ... but verify", but that can reasonably be a separate disconnected effort. 👍

We probably should wait to merge this until at least tomorrow, especially with the changes we deployed today (so we have enough time to make sure they're actually reasonably stable before we ramp down the load we're putting on them).

yosifkit requested a review from a team as a code owner January 25, 2025 00:52

tianon reviewed Jan 28, 2025

Speed up deploy optimistically

bedef05

Trust that if an item was pushed successfully in a previous deploy, then we don't need to push it this time

yosifkit force-pushed the save-deploy branch from 31b8f62 to bedef05 Compare January 29, 2025 00:04

tianon approved these changes Jan 31, 2025

This was referenced Jan 31, 2025

Implement parallelism in deploy #105

Merged

(re-)Implement parallelism in deploy #110

Merged

tianon merged commit 4f22d05 into docker-library:main Jan 31, 2025
1 check passed

tianon deleted the save-deploy branch January 31, 2025 17:11

tianon mentioned this pull request Jan 31, 2025

Remove rateLimitBuilds from deploy jobs #112

Merged
