bats tests - parallelize #5552
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: edsantiago. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
d118eb9 to 4a543fd
Ephemeral COPR build failed. @containers/packit-build please check.
762e878 to a195865
a195865 to d05bc18
d05bc18 to 053354d
4d4e9c9 to e653a16
I assume that once we enable parallel runs here, we can port that over to the bud tests on podman? I'd like to get the speedup there too.
I haven't looked deeply into the prefetch logic changes, but this looks much easier than podman, so that is good.
tests/test_runner.sh
@@ -19,4 +19,4 @@ function execute() {
 TESTS=${@:-.}

 # Run the tests.
-execute time bats --tap $TESTS
+execute time bats -j 4 --tap $TESTS
I would use $(nproc) here instead of hard-coding a count; that should make it much faster when run locally.
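The suggestion above could look roughly like the following sketch. The `-j` flag is the one already used in the diff; the `job_count` helper and the `BATS_JOBS` override variable are hypothetical names introduced here for illustration, and the fallback of 4 when `nproc` is missing is an assumption, not something the PR specifies.

```shell
#!/usr/bin/env bash
# Hypothetical helper: choose a job count for `bats -j`.
# BATS_JOBS is an illustrative override, not a variable the repo defines.
job_count() {
    if [ -n "${BATS_JOBS:-}" ]; then
        # An explicit override wins, e.g. to throttle a loaded CI host.
        echo "$BATS_JOBS"
    elif command -v nproc >/dev/null 2>&1; then
        # Otherwise use the number of available processing units.
        nproc
    else
        # Assumed fallback when nproc is not installed.
        echo 4
    fi
}

# In tests/test_runner.sh the invocation might then read:
#   execute time bats -j "$(job_count)" --tap $TESTS
```

Keeping an override is a design choice worth considering: a laptop benefits from using every core, while a shared CI machine may need a lower cap.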
e653a16 to 31b2ab6
b01284c to aed1bc5
A friendly reminder that this PR had no activity for 30 days.
(A brief look after a link from containers/skopeo#2437 )
For this purpose, it would be better to have the image’s layers be clearly independent from any other test — maybe a |
… oh, and: debug-level logs could help.
Thank you. I thought I had checked for conflicts, but must've missed something. I'll look into this again when time allows.
d17cb1b to 68722ca
Where is that |
I can’t see any obvious reason. I’d suggest doing the pushes with |
68722ca to 55a1cdd
I understand this pull request is in draft state, but I got this warning on OpenScanHub:
It should be fixed before merging this pull request.
3fb8b6f to 5ff53d1
Got it with debug. In-page search for |
5ff53d1 to aa0f9bb
aa0f9bb to 0ef22be
Note to self:
OK. Then:
How did we lose the compression format knowledge?? That’s not the immediate cause, but it suggests something unexpected is happening. Both have |
The test image is created via
It is the same image used for each test iteration ( One approach I've considered but not tried: build a new image on each iteration. My gut tells me that this might get tests to pass, but is not necessarily the right thing to do. It depends on whether this issue is a real one that we might be sweeping under the rug. |
The _prefetch helper, introduced in containers#2036, is not parallel-safe: two or more parallel jobs fetching the same image can step on each other and produce garbage images.
Although we still can't run buildah tests in parallel (see containers#5552), we can at least set up the scaffolding for that to happen. This commit reworks _prefetch() such that the image work is wrapped inside flock. It has been working fine for months in containers#5552, and is IMO safe for production.
This can then make it much easier to flip the parallelization switch once the final zstd bug is squashed.
Signed-off-by: Ed Santiago <[email protected]>
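The locking idea described in this commit message can be illustrated with a small sketch. This is not the actual _prefetch() code: `PREFETCH_DIR`, `fetch_image`, and `prefetch_locked` are hypothetical names, and the "fetch" is a stand-in that writes a marker file rather than pulling an image. It assumes the util-linux `flock` command is installed.

```shell
#!/usr/bin/env bash
# Sketch only: serialize cache population with flock(1).
# PREFETCH_DIR, fetch_image, and prefetch_locked are hypothetical names.
PREFETCH_DIR="${PREFETCH_DIR:-${TMPDIR:-/tmp}/prefetch-demo}"

# Stand-in for the real image pull; it only writes a marker file.
fetch_image() {
    echo "contents of $1" > "$2"
}

prefetch_locked() {
    local image="$1" safe cache
    mkdir -p "$PREFETCH_DIR"
    # Derive a filesystem-safe cache name from the image reference.
    safe=$(printf '%s' "$image" | tr -c 'A-Za-z0-9._-' '_')
    cache="$PREFETCH_DIR/$safe.tar"
    (
        # Block until this job holds an exclusive lock on fd 9.
        flock 9
        # Re-check inside the lock: another job may have filled the cache.
        if [ ! -e "$cache" ]; then
            # Write to a temp name, then rename, so readers never see a
            # half-written file.
            fetch_image "$image" "$cache.tmp" && mv "$cache.tmp" "$cache"
        fi
    ) 9>"$PREFETCH_DIR/$safe.lock" || return 1
    echo "$cache"
}
```

Using one lock file per image means jobs fetching different images do not wait on each other; only jobs racing for the same image are serialized.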
All bats tests run with custom root/runroot, so it should be possible to parallelize them. Signed-off-by: Ed Santiago <[email protected]>
Signed-off-by: Ed Santiago <[email protected]>
Signed-off-by: Ed Santiago <[email protected]>
Signed-off-by: Ed Santiago <[email protected]>
0ef22be to b380e9a
This is about as clean as I can leave this PR. Good luck!
@@ -165,6 +165,7 @@ _EOF
 # Helper function. push our image with the given options, and run skopeo inspect
 function _test_buildah_push() {
   run_buildah push \
+    --log-level=debug \
FIXME! Remove this. It is only present in order to debug the zstd flake.
@@ -22,6 +22,7 @@ function mkcw_check_image() {

 # Decrypt, mount, and take a look around.
 uuid=$(cryptsetup luksUUID "$mountpoint"/disk.img)
+echo "# uuid=$uuid" >&3
This isn't needed either.
All bats tests run with custom root/runroot, so it should be
possible to parallelize them.
(As of this initial commit, tests fail on my laptop, and I expect them to fail here. I just want to get a sense for how things go.)
Signed-off-by: Ed Santiago [email protected]
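To make the root/runroot point in the description concrete, here is a minimal sketch of per-test storage isolation. `storage_opts` is a hypothetical helper, not the repo's own code; `--root` and `--runroot` are buildah's global storage flags, and `$BATS_TEST_TMPDIR` is the per-test temporary directory that recent bats versions provide. How the real test helpers assemble these options is not shown in this conversation.

```shell
#!/usr/bin/env bash
# Sketch only: give every test its own storage locations so that
# parallel jobs never share state. storage_opts is a hypothetical helper.
storage_opts() {
    local testdir="$1"
    # Create separate persistent and runtime storage directories.
    mkdir -p "$testdir/root" "$testdir/runroot"
    echo "--root $testdir/root --runroot $testdir/runroot"
}

# Under bats, a test might then run something like:
#   buildah $(storage_opts "$BATS_TEST_TMPDIR") images
```

Isolated storage removes one source of cross-test interference, but shared resources outside it (such as a common image cache) still need their own protection.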