
Migrating from policy.type kyverno to none sometimes (?) fails #335

Open
ralgozino opened this issue Jan 14, 2025 · 1 comment · May be fixed by #336
Labels: bug (Something isn't working), question (Further information is requested)

Comments

@ralgozino (Member)

Describe the bug

Changing the spec.distribution.modules.policy.type value from kyverno to none to uninstall Kyverno sometimes fails with the following error:

furyctl apply -c v1.31.0.yaml -p distribution
INFO Downloading distribution...
INFO Validating configuration file...
INFO Downloading dependencies...
INFO Running preflight checks...
INFO Checking that the cluster is reachable...
INFO Cluster configuration has changed, checking for immutable violations...
INFO Cluster configuration has changed, checking for unsupported reducers violations...
INFO Cluster configuration has changed, checking if changes are supported in the current phase...
INFO Preflight checks completed successfully
INFO changes to the policy module type have been detected. This will cause the reconfiguration or deletion of the current policy stack.
INFO Differences found from previous cluster configuration, handling the following changes:
.spec.distribution.modules.policy.type: kyverno -> none
INFO Running preupgrade phase...
INFO Preupgrade phase completed successfully
WARNING: You are about to apply changes to the cluster configuration.
Are you sure you want to continue? Only 'yes' will be accepted to confirm.
yes
INFO Installing Kubernetes Fury Distribution...
INFO Checking that the cluster is reachable...
INFO Checking storage classes...
INFO Applying manifests...
ERRO error while creating cluster: error while executing distribution phase: error while executing phase: error running core distribution phase: error running pre-apply reducers: error applying manifests: error while running shell: /bin/sh sh /Users/<EDITED>/.furyctl/multipass/distribution/scripts/pre-apply.sh: command failed - exit status 1
out: namespace "kyverno" deleted
customresourcedefinition.apiextensions.k8s.io "admissionreports.kyverno.io" deleted
customresourcedefinition.apiextensions.k8s.io "backgroundscanreports.kyverno.io" deleted
customresourcedefinition.apiextensions.k8s.io "cleanuppolicies.kyverno.io" deleted
customresourcedefinition.apiextensions.k8s.io "clusteradmissionreports.kyverno.io" deleted
customresourcedefinition.apiextensions.k8s.io "clusterbackgroundscanreports.kyverno.io" deleted
customresourcedefinition.apiextensions.k8s.io "clustercleanuppolicies.kyverno.io" deleted
customresourcedefinition.apiextensions.k8s.io "clusterephemeralreports.reports.kyverno.io" deleted
customresourcedefinition.apiextensions.k8s.io "clusterpolicies.kyverno.io" deleted
customresourcedefinition.apiextensions.k8s.io "clusterpolicyreports.wgpolicyk8s.io" deleted
customresourcedefinition.apiextensions.k8s.io "ephemeralreports.reports.kyverno.io" deleted
customresourcedefinition.apiextensions.k8s.io "globalcontextentries.kyverno.io" deleted
customresourcedefinition.apiextensions.k8s.io "policies.kyverno.io" deleted
customresourcedefinition.apiextensions.k8s.io "policyexceptions.kyverno.io" deleted
customresourcedefinition.apiextensions.k8s.io "policyreports.wgpolicyk8s.io" deleted
customresourcedefinition.apiextensions.k8s.io "updaterequests.kyverno.io" deleted
serviceaccount "kyverno-admission-controller" deleted
serviceaccount "kyverno-background-controller" deleted
serviceaccount "kyverno-cleanup-controller" deleted
serviceaccount "kyverno-cleanup-jobs" deleted
serviceaccount "kyverno-reports-controller" deleted
role.rbac.authorization.k8s.io "kyverno:admission-controller" deleted
role.rbac.authorization.k8s.io "kyverno:background-controller" deleted
role.rbac.authorization.k8s.io "kyverno:cleanup-controller" deleted
role.rbac.authorization.k8s.io "kyverno:reports-controller" deleted
clusterrole.rbac.authorization.k8s.io "kyverno:admission-controller" deleted
clusterrole.rbac.authorization.k8s.io "kyverno:admission-controller:core" deleted
clusterrole.rbac.authorization.k8s.io "kyverno:background-controller" deleted
clusterrole.rbac.authorization.k8s.io "kyverno:background-controller:core" deleted
clusterrole.rbac.authorization.k8s.io "kyverno:cleanup-controller" deleted
clusterrole.rbac.authorization.k8s.io "kyverno:cleanup-controller:core" deleted
clusterrole.rbac.authorization.k8s.io "kyverno:cleanup-jobs" deleted
clusterrole.rbac.authorization.k8s.io "kyverno:rbac:admin:policies" deleted
clusterrole.rbac.authorization.k8s.io "kyverno:rbac:admin:policyreports" deleted
clusterrole.rbac.authorization.k8s.io "kyverno:rbac:admin:reports" deleted
clusterrole.rbac.authorization.k8s.io "kyverno:rbac:admin:updaterequests" deleted
clusterrole.rbac.authorization.k8s.io "kyverno:rbac:view:policies" deleted
clusterrole.rbac.authorization.k8s.io "kyverno:rbac:view:policyreports" deleted
clusterrole.rbac.authorization.k8s.io "kyverno:rbac:view:reports" deleted
clusterrole.rbac.authorization.k8s.io "kyverno:rbac:view:updaterequests" deleted
clusterrole.rbac.authorization.k8s.io "kyverno:reports-controller" deleted
clusterrole.rbac.authorization.k8s.io "kyverno:reports-controller:core" deleted
rolebinding.rbac.authorization.k8s.io "kyverno:admission-controller" deleted
rolebinding.rbac.authorization.k8s.io "kyverno:background-controller" deleted
rolebinding.rbac.authorization.k8s.io "kyverno:cleanup-controller" deleted
rolebinding.rbac.authorization.k8s.io "kyverno:reports-controller" deleted
clusterrolebinding.rbac.authorization.k8s.io "kyverno:admission-controller" deleted
clusterrolebinding.rbac.authorization.k8s.io "kyverno:background-controller" deleted
clusterrolebinding.rbac.authorization.k8s.io "kyverno:cleanup-controller" deleted
clusterrolebinding.rbac.authorization.k8s.io "kyverno:cleanup-jobs" deleted
clusterrolebinding.rbac.authorization.k8s.io "kyverno:reports-controller" deleted
configmap "kyverno" deleted
configmap "kyverno-metrics" deleted
service "kyverno-background-controller-metrics" deleted
service "kyverno-cleanup-controller" deleted
service "kyverno-cleanup-controller-metrics" deleted
service "kyverno-reports-controller-metrics" deleted
service "kyverno-svc" deleted
service "kyverno-svc-metrics" deleted
deployment.apps "kyverno-admission-controller" deleted
deployment.apps "kyverno-background-controller" deleted
deployment.apps "kyverno-cleanup-controller" deleted
deployment.apps "kyverno-reports-controller" deleted
cronjob.batch "kyverno-cleanup-admission-reports" deleted
cronjob.batch "kyverno-cleanup-cluster-admission-reports" deleted

err: resource mapping not found for name: "disallow-capabilities" namespace: "kyverno" from "STDIN": no matches for kind "ClusterPolicy" in version "kyverno.io/v1"
ensure CRDs are installed first
resource mapping not found for name: "disallow-capabilities-strict" namespace: "kyverno" from "STDIN": no matches for kind "ClusterPolicy" in version "kyverno.io/v1"
ensure CRDs are installed first
resource mapping not found for name: "disallow-host-namespaces" namespace: "kyverno" from "STDIN": no matches for kind "ClusterPolicy" in version "kyverno.io/v1"
ensure CRDs are installed first
resource mapping not found for name: "disallow-host-path" namespace: "kyverno" from "STDIN": no matches for kind "ClusterPolicy" in version "kyverno.io/v1"
ensure CRDs are installed first
resource mapping not found for name: "disallow-host-ports" namespace: "kyverno" from "STDIN": no matches for kind "ClusterPolicy" in version "kyverno.io/v1"
ensure CRDs are installed first
resource mapping not found for name: "disallow-latest-tag" namespace: "kyverno" from "STDIN": no matches for kind "ClusterPolicy" in version "kyverno.io/v1"
ensure CRDs are installed first
resource mapping not found for name: "disallow-privilege-escalation" namespace: "kyverno" from "STDIN": no matches for kind "ClusterPolicy" in version "kyverno.io/v1"
ensure CRDs are installed first
resource mapping not found for name: "disallow-privileged-containers" namespace: "kyverno" from "STDIN": no matches for kind "ClusterPolicy" in version "kyverno.io/v1"
ensure CRDs are installed first
resource mapping not found for name: "disallow-proc-mount" namespace: "kyverno" from "STDIN": no matches for kind "ClusterPolicy" in version "kyverno.io/v1"
ensure CRDs are installed first
resource mapping not found for name: "require-pod-probes" namespace: "kyverno" from "STDIN": no matches for kind "ClusterPolicy" in version "kyverno.io/v1"
ensure CRDs are installed first
resource mapping not found for name: "require-run-as-nonroot" namespace: "kyverno" from "STDIN": no matches for kind "ClusterPolicy" in version "kyverno.io/v1"
ensure CRDs are installed first
resource mapping not found for name: "restrict-sysctls" namespace: "kyverno" from "STDIN": no matches for kind "ClusterPolicy" in version "kyverno.io/v1"
ensure CRDs are installed first
resource mapping not found for name: "unique-ingress-host-and-path" namespace: "kyverno" from "STDIN": no matches for kind "ClusterPolicy" in version "kyverno.io/v1"
ensure CRDs are installed first

To Reproduce

  1. Set policy type to kyverno and deploy the cluster.
  2. Change the policy type from kyverno to none and re-apply the config.
  3. The deletion process will fail with the error above.
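The change in step 2 looks roughly like this in the furyctl configuration file (the field path is taken from the diff in the output above; the rest of the file is omitted):

```yaml
# v1.31.0.yaml (excerpt) - switch the policy module from Kyverno to none
spec:
  distribution:
    modules:
      policy:
        type: none # previously: kyverno
```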

You might need to repeat the process until the bug triggers; I could not reproduce it systematically.

It seems like the --ignore-not-found flag is sometimes not respected?

Expected behavior

Kyverno should be uninstalled without errors.

Desktop (please complete the following information):

  • OS: macOS
  • furyctl version: 0.31.0

Kubernetes (please complete the following information):

  • Kubernetes version: 1.31.0
  • KFD: v1.31.0

Additional context

I'll investigate this more and update the issue if I find anything else. I can't rule out something weird in my test environment.

I found this bug while working on #334

@ralgozino ralgozino added bug Something isn't working question Further information is requested labels Jan 14, 2025
@ralgozino (Member, Author) commented Jan 15, 2025

So, this is what I think is happening:

  1. The migrations execute the following command:

$kustomizebin build $vendorPath/modules/opa/katalog/kyverno | $kubectlbin delete --ignore-not-found --wait --timeout=180s -f -

Notice that the command builds the whole Kyverno kustomize base and passes it to kubectl delete. The output of kustomize build contains both the CRDs and the policies that use the kinds created by those CRDs, and all of them are then deleted with kubectl.

  2. I believe we sometimes hit a race condition where the APIs are removed before the resources that use them, because the CRDs are deleted first.

In this case kubectl delete fails because it no longer knows the API (the no matches for kind "ClusterPolicy" in version "kyverno.io/v1" error in the logs above), and this error case is not covered by --ignore-not-found, which ignores only "resource not found" errors.

The issue is not always present because the API server needs some time to stop serving the APIs after the CRDs have been deleted, so sometimes the kubectl delete command succeeds because the APIs are still available even though the CRDs have already been deleted.

Notice that the ClusterPolicy objects are deleted anyway when their CRD is deleted.

Changing the command to delete the ClusterPolicy objects first and then the CRDs should solve the issue.
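The two-phase ordering can be sketched outside of a cluster: split the multi-document manifest stream into non-CRD objects (deleted first, while their APIs are still served) and CRDs (deleted last). This is only an illustration of the idea, using a hardcoded sample stream and an awk filter rather than the actual furyctl migration code; in the real fix each phase would be piped to kubectl delete --ignore-not-found -f -.

```shell
#!/bin/sh
# Partition a multi-document manifest stream so that non-CRD objects come
# first and CRDs come last. Sample stream (hypothetical, mirrors the
# structure of the Kyverno base):
manifests='---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: clusterpolicies.kyverno.io
---
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-capabilities'

# select_docs <want_crd>: print only the documents whose CRD-ness matches
# (1 = CustomResourceDefinition documents, 0 = everything else).
select_docs() {
  awk -v want="$1" '
    /^---$/ { if (doc != "" && crd == want) printf "%s---\n", doc; doc=""; crd=0; next }
    /^kind: CustomResourceDefinition$/ { crd=1 }
    { doc = doc $0 "\n" }
    END { if (doc != "" && crd == want) printf "%s---\n", doc }
  '
}

# Phase 1: the objects that use the CRD-provided APIs (delete these first).
phase1=$(printf '%s\n' "$manifests" | select_docs 0)
# Phase 2: the CRDs themselves (delete these last).
phase2=$(printf '%s\n' "$manifests" | select_docs 1)

printf 'phase 1:\n%s\nphase 2:\n%s\n' "$phase1" "$phase2"
```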

@ralgozino ralgozino self-assigned this Jan 15, 2025
ralgozino added a commit that referenced this issue Jan 15, 2025
When switching from kyverno to none, delete first the objects using the
APIs of the CRDs and then the CRDs. Otherwise we could sometimes end up
in a race condition where the objects cannot be deleted because the APIs
are not available anymore.

Fixes #335
@ralgozino ralgozino linked a pull request Jan 16, 2025 that will close this issue