tests/e2e: Remove kata-containers test #299
Conversation
/test
Thanks @stevenhorsman ! Need to remove the call to clone_kata_tests in c1fd569#diff-33bd1d8cc1a7b67275bc3a8a02470ba638f978798df7501e70ec085649aa79e6R69
Force-pushed from c1fd569 to 23e0f4f
/test
Force-pushed from 23e0f4f to a045f0d
/test
@stevenhorsman after removing the code below, I think we are ready to merge:
Force-pushed from a045f0d to 3617f2a
Good idea to remove some more code. Updated and added your co-authorship for the suggestions.
/test
Working locally now as well:
Thanks once again @stevenhorsman !
Remove the kata-containers `CCv0` e2e tests in the operator, as these tests are on a dead branch of a dead repo. We have also had issues with them not being suitable, as they programmatically change the configuration, which isn't likely to work in GHA and isn't what we want to test either. Later we should add an e2e smoke test that ensures that the kata payload integrates with things like nydus-snapshotter, which the operator is responsible for setting up and configuring, as discussed in confidential-containers#239. Collapse all the runtime classes to run the same tests for now; when we add the smoke test, we can give it the smarts to handle all the platforms.
Signed-off-by: stevenhorsman <[email protected]>
Co-authored-by: Wainer dos Santos Moschetta <[email protected]>
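The commit message mentions a future e2e smoke test that exercises each runtime class. A minimal sketch of what such a test could look like: the function name, the runtime class names passed in by the caller, the busybox image, and the `KUBECTL` override variable are all illustrative assumptions, not code from this PR.

```shell
# Allow the kubectl binary to be overridden (useful for stubbing in tests).
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical smoke test: launch a trivial pod under the given runtime
# class and wait for it to complete, failing if it never succeeds.
smoke_test_runtimeclass() {
  local rc="$1"
  # Start a short-lived pod pinned to the runtime class via an override.
  $KUBECTL run "smoke-$rc" \
    --image=busybox \
    --restart=Never \
    --overrides="{\"spec\": {\"runtimeClassName\": \"$rc\"}}" \
    --command -- /bin/sh -c 'echo ok' || return 1
  # Wait until the pod phase reports Succeeded.
  $KUBECTL wait --for=jsonpath='{.status.phase}'=Succeeded \
    "pod/smoke-$rc" --timeout=120s
}
```

Such a helper could then be called once per platform's runtime class (e.g. `smoke_test_runtimeclass kata-qemu`), which would keep the per-platform "smarts" in one place.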
Force-pushed from ed67c61 to 143bd0d
/test
@BbolroC - hey Choi, we're seeing failures with the s390x tests here (the install works, but the uninstall test looks to be failing during its install step). Do you know if they should be reliable, or is this a known issue?
FYI the kata-qemu job is disabled: http://jenkins.katacontainers.io/view/all/job/confidential-containers-operator-main-ubuntu-22.04-x86_64-containerd_kata-qemu-PR/, so I will remove it from the required status checks
Nice
It should be reliable, as we can see a stable test result for the SE-enabled cc-operator nightly test at http://jenkins.katacontainers.io/job/confidential-containers-operator-main-ubuntu-20.04-s390x-SE-daily/. I will re-run the test and see what happened.
Just an update here - I've tested the behaviour on a z VSI and see the same result as the tests, where the cc-operator-pre-install-daemon pod is not running in the second test, so it seems like this is a valid failure. Weirdly, if I run the
so it seems to be related to this PR, but I can't explain why. I'll try and work with Choi (who is on vacation today) on Monday to try and get to the bottom of it, unless anyone else has any ideas?
Since switching to the
hi @stevenhorsman !
I could reproduce it in an x86_64 VM. I'm using the CCv0 branch (i.e. I don't have #298). Looking at the node labels, I see:
@stevenhorsman when I switch the label to false, I get the pre-install pod running (although failing). However, I still have no idea why it simply works when running with the kata tests... no idea...
ubuntu@myvm:~/operator/tests/e2e$ sudo -E kubectl get pods -n confidential-containers-system
NAME READY STATUS RESTARTS AGE
cc-operator-controller-manager-ccbbcfdf7-bq9rx 2/2 Running 0 3h37m
cc-operator-daemon-install-krrng 1/1 Running 0 3h37m
ubuntu@myvm:~/operator/tests/e2e$ sudo -E kubectl label nodes $(hostname) cc-preinstall/done=false
error: 'cc-preinstall/done' already has a value (true), and --overwrite is false
ubuntu@myvm:~/operator/tests/e2e$ sudo -E kubectl label nodes $(hostname) --overwrite cc-preinstall/done=false
node/myvm labeled
ubuntu@myvm:~/operator/tests/e2e$ sudo -E kubectl get pods -n confidential-containers-system
NAME READY STATUS RESTARTS AGE
cc-operator-controller-manager-ccbbcfdf7-bq9rx 2/2 Running 0 3h38m
cc-operator-daemon-install-krrng 1/1 Running 0 3h37m
cc-operator-pre-install-daemon-h9csb 0/1 Error 1 (4s ago) 8s
ubuntu@myvm:~/operator/tests/e2e$ sudo -E kubectl get pods -n confidential-containers-system
NAME READY STATUS RESTARTS AGE
cc-operator-controller-manager-ccbbcfdf7-bq9rx 2/2 Running 0 3h38m
cc-operator-daemon-install-krrng 1/1 Running 0 3h38m
cc-operator-pre-install-daemon-h9csb 0/1 Error 1 (10s ago) 14s
ubuntu@myvm:~/operator/tests/e2e$ sudo -E kubectl describe pods/cc-operator-pre-install-daemon-h9csb -n confidential-containers-system
Name: cc-operator-pre-install-daemon-h9csb
Namespace: confidential-containers-system
Priority: 0
Node: myvm/192.168.122.219
Start Time: Fri, 01 Dec 2023 18:39:40 +0000
Labels: controller-revision-hash=84f7757fb
name=cc-operator-pre-install-daemon
pod-template-generation=1
Annotations: <none>
Status: Running
IP: 10.244.0.9
IPs:
IP: 10.244.0.9
Controlled By: DaemonSet/cc-operator-pre-install-daemon
Containers:
cc-runtime-pre-install-pod:
Container ID: containerd://3efdb93b1284419722b67ec5261ee1eb368433a03d5e8fa1379181ffa2d53ea3
Image: localhost:5000/reqs-payload
Image ID: localhost:5000/reqs-payload@sha256:1d69e07328b90954fd88d67829a446813371ae49c18bcf285d4a27d31720ca31
Port: <none>
Host Port: <none>
Command:
/bin/sh
-c
/opt/confidential-containers-pre-install-artifacts/scripts/pre-install.sh
State: Terminated
Reason: Error
Exit Code: 1
Started: Fri, 01 Dec 2023 18:40:03 +0000
Finished: Fri, 01 Dec 2023 18:40:06 +0000
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Fri, 01 Dec 2023 18:39:44 +0000
Finished: Fri, 01 Dec 2023 18:39:47 +0000
Ready: False
Restart Count: 2
Environment:
NODE_NAME: (v1:spec.nodeName)
INSTALL_OFFICIAL_CONTAINERD: false
Mounts:
/etc/containerd/ from containerd-conf (rw)
/etc/systemd/system/ from etc-systemd-system (rw)
/opt/confidential-containers/ from confidential-containers-artifacts (rw)
/usr/local/bin/ from local-bin (rw)
/var/lib/containerd-nydus/ from containerd-nydus (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-vxkq5 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
confidential-containers-artifacts:
Type: HostPath (bare host directory volume)
Path: /opt/confidential-containers/
HostPathType: DirectoryOrCreate
etc-systemd-system:
Type: HostPath (bare host directory volume)
Path: /etc/systemd/system/
HostPathType:
containerd-conf:
Type: HostPath (bare host directory volume)
Path: /etc/containerd/
HostPathType:
local-bin:
Type: HostPath (bare host directory volume)
Path: /usr/local/bin/
HostPathType:
containerd-nydus:
Type: HostPath (bare host directory volume)
Path: /var/lib/containerd-nydus/
HostPathType:
kube-api-access-vxkq5:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: node.kubernetes.io/worker=
Tolerations: node.kubernetes.io/disk-pressure:NoSchedule op=Exists
node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists
node.kubernetes.io/pid-pressure:NoSchedule op=Exists
node.kubernetes.io/unreachable:NoExecute op=Exists
node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 32s default-scheduler Successfully assigned confidential-containers-system/cc-operator-pre-install-daemon-h9csb to myvm
Normal Pulled 31s kubelet Successfully pulled image "localhost:5000/reqs-payload" in 116.238729ms
Normal Pulled 28s kubelet Successfully pulled image "localhost:5000/reqs-payload" in 136.977628ms
Normal Pulling 9s (x3 over 31s) kubelet Pulling image "localhost:5000/reqs-payload"
Normal Created 9s (x3 over 31s) kubelet Created container cc-runtime-pre-install-pod
Normal Started 9s (x3 over 31s) kubelet Started container cc-runtime-pre-install-pod
Normal Pulled 9s kubelet Successfully pulled image "localhost:5000/reqs-payload" in 115.467434ms
Warning BackOff 6s (x2 over 24s) kubelet Back-off restarting failed container
Both uninstall and reinstall tests (see
@stevenhorsman could you please cherry-pick the two commits on main...wainersm:cc-operator:operator_tests_debug? If they don't fix the issue, they at least make the uninstall test better.
When printing all pods on teardown(), the namespace should be confidential-containers-system (it was missing the -system suffix). While here, added a command to print all DaemonSets (DS) in that namespace, because sometimes the DS is created correctly but pods aren't scheduled due to unmet conditions (e.g. a missing 'node.kubernetes.io/worker' node label).
Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
Added checks to ensure the operator was uninstalled correctly.
Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
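Uninstall checks of this kind generally boil down to polling until a resource listing comes back empty or a timeout expires. A generic sketch of such a helper, assuming a hypothetical `wait_for_deletion` name (this is not the code from the cherry-picked commits):

```shell
# Poll the given command until it produces no output (i.e. the resources
# it lists are gone), or fail after the timeout. Arguments:
#   $1 - command to run (word-split intentionally, e.g. a kubectl listing)
#   $2 - timeout in seconds (default 60)
#   $3 - poll interval in seconds (default 5)
wait_for_deletion() {
  local cmd="$1" timeout="${2:-60}" interval="${3:-5}" elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    # Empty output means nothing is left to delete: success.
    if [ -z "$($cmd 2>/dev/null)" ]; then
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "timed out waiting for deletion: $cmd" >&2
  return 1
}
```

In an e2e teardown this would be called with something like a `kubectl get pods -n confidential-containers-system --no-headers` listing, so the test fails loudly instead of leaving a half-uninstalled operator behind.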
/test
Woo! Great job @wainersm - things are working now!