-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DFC uninstall flow #55
Conversation
ezio-auditore
commented
May 4, 2023
•
edited
Loading
edited
- checks for PVs using ocs storage classes and restricts delete if found
- delete default storageclass claims
- delete storageclient
29b0c2a
to
0c5935f
Compare
0c5935f
to
af45798
Compare
af45798
to
31b048f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ezio-auditore I don't see a function that will indicate whether the offering is safe to be removed or not. We should have dfcIsReadyToBeRemoved
function which will be called in pluginIsReadyToBeRemoved
function.
09c5614
to
fea9ac2
Compare
|
fea9ac2
to
6470baf
Compare
6470baf
to
387e4d1
Compare
387e4d1
to
9e61b17
Compare
controllers/datafoundationclient.go
Outdated
} | ||
} | ||
pvList := &corev1.PersistentVolumeList{} | ||
if err := r.list(pvList); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that this will only list PVs from the set of namespaces the operator group allows for it.
We need to list all PVs! for that I believe you can use the Unrestricted Client
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to fetch using UnrestrictedClient
controllers/datafoundationclient.go
Outdated
ocsStorageClass := map[string]bool{} | ||
for i := range storageClassList.Items { | ||
storageClass := storageClassList.Items[i] | ||
if storageClass.Provisioner == r.offering.Namespace+ocsCephFsProvisionerSuffix || storageClass.Provisioner == r.offering.Namespace+ocsRBDProvisionerSuffix { | ||
ocsStorageClass[storageClass.Name] = true | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to calculate the provisioners name on each loop iteration?
ocsStorageClass := map[string]bool{} | |
for i := range storageClassList.Items { | |
storageClass := storageClassList.Items[i] | |
if storageClass.Provisioner == r.offering.Namespace+ocsCephFsProvisionerSuffix || storageClass.Provisioner == r.offering.Namespace+ocsRBDProvisionerSuffix { | |
ocsStorageClass[storageClass.Name] = true | |
} | |
} | |
ocsStorageClass := map[string]bool{} | |
rbdProvisionerName = r.offering.Namespace + ocsRBDProvisionerSuffix | |
fsProvisionerName := r.offering.Namespace + ocsCephFsProvisionerSuffix | |
for i := range storageClassList.Items { | |
storageClass := storageClassList.Items[i] | |
if storageClass.Provisioner == rbdProvisionerName || storageClass.Provisioner == fsProvisionerName { | |
ocsStorageClass[storageClass.Name] = true | |
} | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
9b9151c
to
e00697f
Compare
r.Log.Info("issuing a delete for Default StorageClassClaims") | ||
if err := r.delete(&r.defaultBlockStorageClassClaim); err != nil { | ||
return ctrl.Result{}, fmt.Errorf("failed to issue a delete for defaultBlockStorageClassClaim: %v", err) | ||
} | ||
if err := r.delete(&r.defaultFSStorageClassClaim); err != nil { | ||
return ctrl.Result{}, fmt.Errorf("failed to issue a delete for defaultFSStorageClassClaim: %v", err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we delete non-default storageClassClaims as well that are provisioned by cephfs.csi.ceph.com
or rbd.csi.ceph.com
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a very good question! @ezio-auditore what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
default storageclient spec
Are we even allowing customer to create non-default storageClient? Or do we have plans to allow customer to create non-default storageClients?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the user creates his custom storageClassClaim using the the default storageclient spec
@ezio-auditore that does not make any sense. The reason someone will create another claim is if they want some aspects of the storageclass to be different
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nb-ohad, the ocs-client-operator is already checking for any PV before deleting the SCC. Why do we need to recheck the PV here again?
The question here was about SCC, I think we should just delete the lot of them
You mean list all the SCC, delete them, and wait for them to be deleted before deleting the storage client right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we support Users creating custom storageclass and static provisioning without SCC in managed service?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we support Users creating custom storageclass and static provisioning without SCC in managed service?
On consumer clusters, customer can do anything they want to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean list all the SCC, delete them, and wait for them to be deleted before deleting the storage client right?
I think we can issue a delete for all SCCs and then issue a delete for storageClient. No need to wait for the removal of all SCCs before deleting storageClient because client-operator won't delete storageClient unless all SCCs are removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Madhu-1 I am aware the ocs-client-operator is already checking for that.
But a k8s delete operation is unrevertable so we check for potential lock conditions, from the operator that owns the lifecycle of the offering (in this case the client operator) before issuing the delete request.
e00697f
to
1649216
Compare
Signed-off-by: Kaustav Majumder <[email protected]>
1649216
to
5a30712
Compare
) | ||
|
||
//+kubebuilder:rbac:groups=ocs.openshift.io,resources=storageclients,verbs=get;list;watch;create;update;patch;delete | ||
//+kubebuilder:rbac:groups=ocs.openshift.io,resources=storageclassclaims,verbs=get;list;watch;create;update;patch;delete | ||
//+kubebuilder:rbac:groups=operators.coreos.com,resources=clusterserviceversions,verbs=get;list;watch;delete;update;patch | ||
//+kubebuilder:rbac:groups="",resources=persistentvolumes,verbs=get;list;watch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this give us enough privileges to read and delete over all namespaces?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it gives us enough privileges
) | ||
|
||
//+kubebuilder:rbac:groups=ocs.openshift.io,resources=storageclients,verbs=get;list;watch;create;update;patch;delete | ||
//+kubebuilder:rbac:groups=ocs.openshift.io,resources=storageclassclaims,verbs=get;list;watch;create;update;patch;delete | ||
//+kubebuilder:rbac:groups=operators.coreos.com,resources=clusterserviceversions,verbs=get;list;watch;delete;update;patch | ||
//+kubebuilder:rbac:groups="",resources=persistentvolumes,verbs=get;list;watch | ||
//+kubebuilder:rbac:groups="storage.k8s.io",resources=storageclass,verbs=get;list;watch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need access to storageclass? Doesn't uninstallation of ocs-client-operator already take care of deletion of storageclassclaims and as a result the attached storageclasses?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need access to get, list and watch storageClass to fetch the PVs that are using ocs storageclasses
r.defaultBlockStorageClassClaim.Name = defaultBlockStorageClassClaimName | ||
|
||
r.defaultFSStorageClassClaim.Name = "ocs-storagecluster-cephfs" | ||
r.defaultFSStorageClassClaim.Name = defaultFSStorageClassClaimName |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why we are treating these storageclassclaims differently then ant other storagelcassclaim?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are the default storageClassClaims that the agent is creating.
r.Log.Info("issuing a delete for Default StorageClassClaims") | ||
if err := r.delete(&r.defaultBlockStorageClassClaim); err != nil { | ||
return ctrl.Result{}, fmt.Errorf("failed to issue a delete for defaultBlockStorageClassClaim: %v", err) | ||
} | ||
if err := r.delete(&r.defaultFSStorageClassClaim); err != nil { | ||
return ctrl.Result{}, fmt.Errorf("failed to issue a delete for defaultFSStorageClassClaim: %v", err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The existence of storageclassclaims or even storageclasses does not infer usage of storage. I am against blocking uninstallation based on that
rbdProvisionerName := r.offering.Namespace + ocsRBDProvisionerSuffix | ||
fsProvisionerName := r.offering.Namespace + ocsCephFsProvisionerSuffix | ||
for i := range storageClassList.Items { | ||
storageClass := storageClassList.Items[i] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To prevent a copy operation, please take a reference
storageClass := storageClassList.Items[i] | |
storageClass := &storageClassList.Items[i] |
rbdProvisionerName := r.offering.Namespace + ocsRBDProvisionerSuffix | ||
fsProvisionerName := r.offering.Namespace + ocsCephFsProvisionerSuffix |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ocsRBDProvisionerSuffix
and ocsCephFsProvisionerSuffix
are only used once here. Can we just embed the values?
} | ||
pvList := &corev1.PersistentVolumeList{} | ||
pvCount := 0 | ||
if err := r.UnrestrictedClient.List(r.ctx, pvList); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we had an r.UnrestrictedList(pvList)
operation.
if err := r.list(storageClassList); err != nil { | ||
return false, fmt.Errorf("failed to list Storage Classes: %v", err) | ||
} | ||
ocsStorageClass := map[string]bool{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ocsStorageClass := map[string]bool{} | |
ocsStorageClasses := map[string]bool{} |
if pvCount > 0 { | ||
r.Log.Info(fmt.Sprintf("Current count of PVs blocking uninstallation: %v", pvCount)) | ||
return true, nil | ||
} | ||
|
||
return false, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does not make sense to have logic/logging of messages base on the result of this func inside the func itself.
Let's have this function be countOCSVolumes
which returns a number and handles the decision-making process outside.
Log: ctrl.Log.WithName("controllers").WithName("ManagedFusionOffering"), | ||
Scheme: mgr.GetScheme(), | ||
Client: mgr.GetClient(), | ||
UnrestrictedClient: getUnrestrictedClient(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just create the unrestricted client once and rescue it in all controllers?
closing since another PR #63 already addressing this |