This repository has been archived by the owner on Oct 22, 2024. It is now read-only.

operator: Bump operator-sdk version to v1.13.1 #1051

Closed

avalluri
Contributor

The most recent operator-sdk version is 1.14.0 which requires k8s 1.22. We can upgrade to 1.14 as part of adding 1.22 support to the driver.

This reverts commit 6822551.

Needed commits are merged to upstream.
@pohly
Contributor

pohly commented Nov 15, 2021

> which requires k8s 1.22

Does it require packages from 1.22 or does the resulting PMEM-CSI only work with 1.22?

> We can upgrade to 1.14 as part of adding 1.22 support to the driver.

Supporting 1.22 does not mean that we compile against it. To minimize changes, in #1049 I am only changing deployment and testing, but not any compilation steps.

@avalluri
Contributor Author

> > which requires k8s 1.22
>
> Does it require packages from 1.22 or does the resulting PMEM-CSI only work with 1.22?

It requires 1.22 packages and controller-runtime 0.7.0. But I don't think it restricts the operator to working only with 1.22.

@pohly
Contributor

pohly commented Nov 15, 2021

Does using operator-sdk v1.13.1 have any specific advantages? It still doesn't seem to work with the latest OLM.

@avalluri
Contributor Author

> Does using operator-sdk v1.13.1 have any specific advantages? It still doesn't seem to work with the latest OLM.

I am still debugging it...

Somehow the latest OLM release (v0.19.1) fails to start the operator; it gets
stuck while creating the install plan for the subscription. The logs from
the catalog-source pod show:

```
time="2021-11-15T13:51:07Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=
time="2021-11-15T13:51:07Z" level=info msg="syncing catalog source for annotation templates" catSrcName=pmem-csi-operator-catalog catSrcNamespace=default id=da6p4
time="2021-11-15T13:51:08Z" level=warning msg="an error was encountered during reconciliation" error="Operation cannot be fulfilled on subscriptions.operators.coreos.com \"pmem-csi-operator-v100-0-0-sub\": the object has been modified; please apply your changes to the latest version and try again" event=update reconciling="*v1alpha1.Subscription" selflink=
E1115 13:51:08.483478       1 queueinformer_operator.go:290] sync {"update" "default/pmem-csi-operator-v100-0-0-sub"} failed: Operation cannot be fulfilled on subscriptions.operators.coreos.com "pmem-csi-operator-v100-0-0-sub": the object has been modified; please apply your changes to the latest version and try again
```
@avalluri
Contributor Author

The OLM installation failure is a bit strange:

    + /mnt/workspace/pmem-csi_PR-1051/_work/bin/operator-sdk olm install --verbose --version=0.18.3 --timeout=5m
    Installing OLM (0.18.3)...
    time="2021-11-15T18:19:16Z" level=debug msg="Debug logging is set"
    time="2021-11-15T18:19:16Z" level=info msg="Fetching CRDs for version \"0.18.3\""
    time="2021-11-15T18:19:16Z" level=info msg="Fetching resources for resolved version \"v0.18.3\""
    I1115 18:19:18.668569   31934 request.go:665] Waited for 1.008305395s due to client-side throttling, not priority and fairness, request: GET:https://172.17.0.4:6443/apis/node.k8s.io/v1?timeout=32s
    time="2021-11-15T18:19:25Z" level=info msg="Creating CRDs and resources"
    time="2021-11-15T18:19:25Z" level=info msg="  Creating CustomResourceDefinition \"catalogsources.operators.coreos.com\""
    time="2021-11-15T18:19:25Z" level=info msg="  Creating CustomResourceDefinition \"clusterserviceversions.operators.coreos.com\""
    time="2021-11-15T18:19:25Z" level=info msg="  Creating CustomResourceDefinition \"installplans.operators.coreos.com\""
    time="2021-11-15T18:19:25Z" level=info msg="  Creating CustomResourceDefinition \"operatorconditions.operators.coreos.com\""
    time="2021-11-15T18:19:25Z" level=info msg="  Creating CustomResourceDefinition \"operatorgroups.operators.coreos.com\""
    time="2021-11-15T18:19:25Z" level=info msg="  Creating CustomResourceDefinition \"operators.operators.coreos.com\""
    time="2021-11-15T18:19:25Z" level=info msg="  Creating CustomResourceDefinition \"subscriptions.operators.coreos.com\""
    time="2021-11-15T18:19:25Z" level=info msg="  Creating Namespace \"olm\""
    time="2021-11-15T18:19:26Z" level=info msg="  Creating Namespace \"operators\""
    time="2021-11-15T18:19:26Z" level=info msg="  Creating ServiceAccount \"olm/olm-operator-serviceaccount\""
    time="2021-11-15T18:19:26Z" level=info msg="  Creating ClusterRole \"system:controller:operator-lifecycle-manager\""
    time="2021-11-15T18:19:26Z" level=info msg="  Creating ClusterRoleBinding \"olm-operator-binding-olm\""
    time="2021-11-15T18:19:26Z" level=info msg="  Creating Deployment \"olm/olm-operator\""
    time="2021-11-15T18:19:26Z" level=info msg="  Creating Deployment \"olm/catalog-operator\""
    time="2021-11-15T18:19:26Z" level=info msg="  Creating ClusterRole \"aggregate-olm-edit\""
    time="2021-11-15T18:19:26Z" level=info msg="  Creating ClusterRole \"aggregate-olm-view\""
    time="2021-11-15T18:19:26Z" level=info msg="  Creating OperatorGroup \"operators/global-operators\""
    time="2021-11-15T18:19:27Z" level=fatal msg="Failed to install OLM version \"0.18.3\": failed to create CRDs and resources: no matches for kind \"OperatorGroup\" in version \"operators.coreos.com/v1\""
    + /mnt/workspace/pmem-csi_PR-1051/_work/bin/operator-sdk olm status
    time="2021-11-15T18:19:30Z" level=fatal msg="Failed to get OLM status: error getting installed OLM version (set --version to override the default version): no existing installation found"
    + echo 'OLM installation failed!!!'
    OLM installation failed!!!

I guess this is some timing issue: even though the OperatorGroup CRD was created in the previous step, creating an object of that kind fails.
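
If this is indeed a race between creating the CRD and the API server serving the new kind, one possible workaround is to retry the installation. A minimal sketch in POSIX shell follows. The `retry` helper is hypothetical and not part of PMEM-CSI or operator-sdk, and whether a partially failed `olm install` needs an `operator-sdk olm uninstall` before the next attempt is an assumption that would have to be checked.

```shell
#!/bin/sh
# retry MAX DELAY CMD...: run CMD until it succeeds, at most MAX times,
# sleeping DELAY seconds between attempts. (Hypothetical helper.)
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Possible usage against a live cluster (not executed here):
#   retry 3 10 _work/bin/operator-sdk olm install --version=0.18.3 --timeout=5m
```

Retrying only hides the race; it does not explain why the freshly created CRD was not yet usable.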

@pohly
Contributor

pohly commented Nov 17, 2021

That failure to install OLM is different from the failure seen in the CI. In the CI, OLM installed okay but the PMEM-CSI operator didn't.

@pohly
Contributor

pohly commented Nov 17, 2021

See #1050

@@ -1,15 +1,9 @@
-OPERATOR_SDK_VERSION=1.6.1
+OPERATOR_SDK_VERSION=1.14.0

# download operator-sdk binary
_work/bin/operator-sdk-$(OPERATOR_SDK_VERSION):
mkdir -p _work/bin/ 2> /dev/null
Contributor

Why the output redirection to /dev/null?

If there is an error, we want to see it.

git clone --branch $(OPERATOR_SDK_VERSION)+fixes https://github.com/avalluri/operator-sdk.git $$tmpdir && \
cd $$tmpdir && $(MAKE) build/operator-sdk && \
cp $$tmpdir/build/operator-sdk $(abspath $@) && \
curl -L https://github.com/operator-framework/operator-sdk/releases/download/v$(OPERATOR_SDK_VERSION)/operator-sdk_linux_amd64 -o $(abspath $@)
chmod a+x $(abspath $@)
cd $(dir $@); ln -sf operator-sdk-$(OPERATOR_SDK_VERSION) operator-sdk
Contributor

These rules are unnecessarily complex. This is simpler:

_work/bin/operator-sdk-$(OPERATOR_SDK_VERSION):
	mkdir -p _work/bin
	curl -L https://github.com/operator-framework/operator-sdk/releases/download/v$(OPERATOR_SDK_VERSION)/operator-sdk_linux_amd64 -o $@
	chmod a+x $@
	ln -sf operator-sdk-$(OPERATOR_SDK_VERSION) $(@D)/operator-sdk

@pohly
Contributor

pohly commented Nov 17, 2021

The new operator-sdk is unhappy about our bundle. I checked out your branch and ran:

$ make operator-generate
go: creating new go.mod: module tmp
go get: installing executables with 'go get' in module mode is deprecated.
	To adjust and download dependencies of the current module, use 'go get -d'.
	To install using requirements of the current module, use 'go install'.
	To install ignoring the current module, use 'go install' with a version,
	like 'go install example.com/cmd@latest'.
	For more information, see https://golang.org/doc/go-get-install-deprecation
	or run 'go help get' or 'go help install'.
go get: added github.com/fatih/color v1.7.0
go get: added github.com/gobuffalo/flect v0.2.0
go get: added github.com/gogo/protobuf v1.3.1
go get: added github.com/google/gofuzz v1.1.0
go get: added github.com/inconshreveable/mousetrap v1.0.0
go get: added github.com/json-iterator/go v1.1.8
go get: added github.com/mattn/go-colorable v0.1.2
go get: added github.com/mattn/go-isatty v0.0.8
go get: added github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd
go get: added github.com/modern-go/reflect2 v1.0.1
go get: added github.com/spf13/cobra v0.0.5
go get: added github.com/spf13/pflag v1.0.5
go get: added golang.org/x/net v0.0.0-20191004110552-13f9640d40b9
go get: added golang.org/x/sys v0.0.0-20191022100944-742c48ecaeb7
go get: added golang.org/x/text v0.3.2
go get: added golang.org/x/tools v0.0.0-20190920225731-5eefd052ad72
go get: added gopkg.in/inf.v0 v0.9.1
go get: added gopkg.in/yaml.v2 v2.2.8
go get: added gopkg.in/yaml.v3 v3.0.0-20190905181640-827449938966
go get: added k8s.io/api v0.18.2
go get: added k8s.io/apiextensions-apiserver v0.18.2
go get: added k8s.io/apimachinery v0.18.2
go get: added k8s.io/klog v1.0.0
go get: added k8s.io/utils v0.0.0-20200324210504-a9aa75ae1b89
go get: added sigs.k8s.io/controller-tools v0.3.0
go get: added sigs.k8s.io/structured-merge-diff/v3 v3.0.0
go get: added sigs.k8s.io/yaml v1.2.0
Generating CRD in /nvme/gopath/src/github.com/intel/pmem-csi/deploy/crd ...
cd /nvme/gopath/src/github.com/intel/pmem-csi && /nvme/gopath/bin/controller-gen crd:trivialVersions=true,crdVersions=v1 paths=./pkg/apis/... output:dir=/nvme/gopath/src/github.com/intel/pmem-csi/deploy/crd/
version=" v0.3.0"; \
    sed -i "1s/^/# This file was generated by controller-gen$version via 'make operator-generate-crd'\n/" /nvme/gopath/src/github.com/intel/pmem-csi/deploy/crd/*
mkdir -p _work/bin/ 2> /dev/null
curl -L https://github.com/operator-framework/operator-sdk/releases/download/v1.14.0/operator-sdk_linux_amd64 -o /nvme/gopath/src/github.com/intel/pmem-csi/_work/bin/operator-sdk-1.14.0
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   663  100   663    0     0   3662      0 --:--:-- --:--:-- --:--:--  3683
100 73.6M  100 73.6M    0     0  11.8M      0  0:00:06  0:00:06 --:--:-- 19.3M
chmod a+x /nvme/gopath/src/github.com/intel/pmem-csi/_work/bin/operator-sdk-1.14.0
cd _work/bin/; ln -sf operator-sdk-1.14.0 operator-sdk
Generating operator bundle in /nvme/gopath/src/github.com/intel/pmem-csi/deploy/olm-bundle/100.0.0 ...
Generating bundle version 100.0.0
Generating bundle manifests
Bundle manifests generated successfully in /nvme/gopath/src/github.com/intel/pmem-csi/deploy/olm-bundle/100.0.0
Generating bundle metadata
INFO[0000] Creating bundle.Dockerfile                   
INFO[0000] Creating /nvme/gopath/src/github.com/intel/pmem-csi/deploy/olm-bundle/100.0.0/metadata/annotations.yaml 
INFO[0000] Bundle metadata generated suceessfully       
sed -i -e 's;X\.Y\.Z;100.0.0;g' -e 's;X\.Y;100.0;g' /nvme/gopath/src/github.com/intel/pmem-csi/deploy/olm-bundle/100.0.0/manifests/pmem-csi-operator.clusterserviceversion.yaml
sed -i -e 's;\(.*createdAt: \).*;\12021-11-17T10:55:12Z;g' /nvme/gopath/src/github.com/intel/pmem-csi/deploy/olm-bundle/100.0.0/manifests/pmem-csi-operator.clusterserviceversion.yaml
make[1]: Entering directory '/nvme/gopath/src/github.com/intel/pmem-csi'
ERROR: Operator bundle did not pass validation:
time="2021-11-17T10:55:21+01:00" level=error msg="Error: Value pmem-csi-operator: invalid service account found in bundle. sa name cannot match service account defined for deployment spec in CSV"
make[1]: *** [operator/operator.make:91: operator-validate-bundle] Error 1
make[1]: Leaving directory '/nvme/gopath/src/github.com/intel/pmem-csi'
make: *** [operator/operator.make:84: operator-generate-bundle] Error 2

Why is controller-gen coming from my $GOPATH/bin? Shouldn't that also be downloaded with a known-good version?

@pohly
Contributor

pohly commented Nov 17, 2021

Ah, controller-gen is getting installed with go install so it is kind of controlled.

@pohly
Contributor

pohly commented Nov 17, 2021

Here's a similar PR that works for me, please review: #1052

@avalluri avalluri closed this Nov 18, 2021
@avalluri
Contributor Author

Now the new build (operator-sdk v1.14.0 and OLM v0.18.3) succeeds without any failures. But since #1052 addresses the same issue anyway, I am closing this in favor of that PR.

@pohly
Contributor

pohly commented Nov 18, 2021

I'm confused. Why are you not getting the "invalid service account found in bundle. sa name cannot match service account defined for deployment spec in CSV"?

Is it because you switched to operator-sdk 1.14.0?

We should have an explanation for this instead of finding a working combination via trial-and-error.
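
One way to narrow this down is to check the generated bundle for the condition that the error message describes. The following is a rough sketch, not how operator-sdk's validator is implemented: the `sa_conflicts` helper is hypothetical, it matches text rather than parsing YAML, and it assumes unquoted names and two-space indentation under `metadata:`.

```shell
#!/bin/sh
# sa_conflicts DIR: report every ServiceAccount manifest in DIR whose name is
# also referenced as a serviceAccountName in a ClusterServiceVersion file in
# DIR. Returns 0 if there is no overlap, 1 otherwise. (Hypothetical helper.)
sa_conflicts() {
    dir=$1
    status=0
    # Names referenced by the CSV (crude text match, not a YAML parser).
    csv_names=$(sed -n 's/.*serviceAccountName:[[:space:]]*//p' \
        "$dir"/*.clusterserviceversion.yaml 2>/dev/null | sort -u)
    for f in "$dir"/*.yaml; do
        # Only look at manifests whose top-level kind is ServiceAccount.
        grep -q '^kind: ServiceAccount[[:space:]]*$' "$f" 2>/dev/null || continue
        # The first two-space-indented "name:" is assumed to be metadata.name.
        name=$(sed -n 's/^  name:[[:space:]]*//p' "$f" | head -n 1)
        for n in $csv_names; do
            if [ "$n" = "$name" ]; then
                echo "$f: ServiceAccount \"$name\" is also used by the CSV"
                status=1
            fi
        done
    done
    return "$status"
}

# Possible usage (not executed here):
#   sa_conflicts deploy/olm-bundle/100.0.0/manifests
```

Running this on the bundles produced by both operator-sdk versions would show whether the generated manifests differ or only the validation does.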

@pohly
Contributor

pohly commented Nov 18, 2021

> The most recent operator-sdk version is 1.14.0 which requires k8s 1.22. We can upgrade to 1.14 as part of adding 1.22 support to the driver.

That was a red herring, right? It doesn't matter at all which Kubernetes dependencies the operator-sdk has. What matters is its output.
