Random Storage Class Selection During Host-Assisted Snapshot Restore #3530

Open
TheRealSibasishBehera opened this issue Nov 19, 2024 · 9 comments

@TheRealSibasishBehera

What happened:
During host-assisted snapshot restore operations, CDI creates the intermediate PVC with a randomly selected storage class (matched only on the CSI driver), ignoring both the DataVolume's specified storage class and the source PVC's storage class. This leads to restore failures when an incompatible storage class is selected.

Error seen:

In the host-assisted PVC:

ProvisioningFailed: failed to provision volume with StorageClass "ms-ext4-3-replicas-local": 
rpc error: code = InvalidArgument desc = error in response: status code '400 Bad Request', content: 'RestJsonError { details: "", message: "SvcError :: ClonedSnapshotVolumeThin: Cloned snapshot volumes must be thin provisioned", kind: InvalidArgument }'

What you expected to happen:

CDI should follow a predictable storage class selection priority during host-assisted snapshot restore:

  • Use the storage class specified in the DataVolume
  • Fall back to the source PVC's storage class
  • Use the default storage class with a matching driver
  • Only as a last resort, use any matching storage class

How to reproduce it (as minimally and precisely as possible):
Steps to reproduce the behavior.

Create a snapshot from existing PVC:

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: blank-datavolume-snapshot
spec:
  source:
    persistentVolumeClaimName: blank-datavolume

Create DataVolume to restore from snapshot:

apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: new-datavolume-from-snapshot
spec:
  source:
    snapshot:
      name: blank-datavolume-snapshot
      namespace: default
  pvc:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 2Gi
    storageClassName: ms-xfs-snapshot-restore

Observe intermediate PVC creation:

kubectl get pvc <uuid>-host-assisted-source-pvc -o yaml

The intermediate PVC shows storage class ms-ext4-3-replicas-local instead of the specified ms-xfs-snapshot-restore.
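For comparison, the storage class requested in the DV and the one actually set on the intermediate PVC can be read side by side with jsonpath (assuming the dv short name installed by the CDI CRDs):

kubectl get dv new-datavolume-from-snapshot -o jsonpath='{.spec.pvc.storageClassName}'
kubectl get pvc <uuid>-host-assisted-source-pvc -o jsonpath='{.spec.storageClassName}'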

Additional context:

The current implementation in getStorageClassCorrespondingToSnapClass only matches on the CSI driver and takes the first match:

func (r *SnapshotCloneReconciler) getStorageClassCorrespondingToSnapClass(driver string) (string, error) {
    matches := []storagev1.StorageClass{}
    // Lists all storage classes in the cluster
    storageClasses := &storagev1.StorageClassList{}
    if err := r.client.List(context.TODO(), storageClasses); err != nil {
        return "", errors.New("unable to retrieve storage classes")
    }
    // Collects every storage class whose provisioner matches the driver
    for _, storageClass := range storageClasses.Items {
        if storageClass.Provisioner == driver {
            matches = append(matches, storageClass)
        }
    }
    // With multiple matches, the first one is returned regardless of what the DV requested
    if len(matches) > 1 {
        return matches[0].Name, nil
    }
    //...
}

Suggested Changes:

Prioritized Storage Class Selection (a rough sketch follows below):

  1. First: Use the DV's specified storage class (if it matches the driver)
  2. Second: Try the source PVC's storage class (from the snapshot)
  3. Third: Try the default storage class (with a matching driver)
  4. Last resort: Use any matching storage class

or

Have a way for the user to configure the behaviour of the intermediate PVC.
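A rough sketch of what the prioritized lookup could look like (helper and parameter names are purely illustrative; the source-PVC step is omitted here since that class may not be resolvable at this point):

func (r *SnapshotCloneReconciler) pickStorageClassForDriver(driver string, dvStorageClassName *string) (string, error) {
    storageClasses := &storagev1.StorageClassList{}
    if err := r.client.List(context.TODO(), storageClasses); err != nil {
        return "", errors.New("unable to retrieve storage classes")
    }
    fallback := ""
    for _, sc := range storageClasses.Items {
        if sc.Provisioner != driver {
            continue
        }
        // 1. The storage class explicitly requested in the DataVolume wins.
        if dvStorageClassName != nil && sc.Name == *dvStorageClassName {
            return sc.Name, nil
        }
        // 2. Prefer the default storage class over an arbitrary match.
        if sc.Annotations["storageclass.kubernetes.io/is-default-class"] == "true" {
            fallback = sc.Name
            continue
        }
        // 3. Keep the first arbitrary match only as a last resort.
        if fallback == "" {
            fallback = sc.Name
        }
    }
    if fallback == "" {
        return "", errors.New("no storage class found for driver " + driver)
    }
    return fallback, nil
}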

Environment:

  • CDI version (use kubectl get deployments cdi-deployment -o yaml): v1.60.3
  • Kubernetes version (use kubectl version):
    Server: v1.22.17
    Client: v1.31.0
  • DV specification:
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: new-datavolume-from-snapshot
  namespace: default
spec:
  source:
    snapshot:
      name: blank-datavolume-snapshot
      namespace: default
  pvc:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 2Gi
    storageClassName: ms-xfs-snapshot-restore
  • Cloud provider or hardware configuration: N/A
  • OS (e.g. from /etc/os-release): N/A
  • Kernel (e.g. uname -a): N/A
  • Install tools:
  • Others:
    Multiple storage classes using the same CSI driver io.openebs.csi-mayastor:
    • ms-ext4-3-replicas-local
    • ms-xfs-snapshot-restore
    • VolumeSnapshot feature enabled and CRDs installed
@akalenyu
Collaborator

Hey, thanks for opening the issue!

2. Second: Try the source PVC's storage class (from the snapshot)

We've faced this concern when designing the feature. The issue is that the source PVC isn't available when cloning a snapshot; otherwise we would of course just grab the storage class from it.
But I am not against refining the logic around picking the storage class.

We've also had a working assumption that the same provisioner is able to restore from a snapshot regardless of the storage class permutation (basically, that it could handle anything thrown at it just to get through the restore), which I am seeing is not true in your case.

@TheRealSibasishBehera
Author

We've faced this concern when designing the feature. The issue is that the source PVC isn't available when cloning a snapshot; otherwise we would of course just grab the storage class from it.

I totally missed that - my bad.

What do you think about just using DV.PVCSpec.StorageClass? This isn't the best solution, because users might not be aware of intermediate operation mechanisms like the requirements for the host-assisted PVC.

This issue could become common since we don't know what restrictions storage provisioners might have for their cloning/snapshotting. In my case, the storage class must be thin-provisioned.

Giving users more control would be beneficial. Perhaps, similar to the CDI config status field scratchSpaceStorageClass, we could have a snapshotRestoreConfig or similar section.

apiVersion: cdi.kubevirt.io/v1beta1
kind: CDIConfig
spec:
  snapshotRestoreConfig:
    storageClass: "thin-provisioned-sc"      # Single provisioner
    storageClassMap:                         # Multiple provisioners
      "provisioner-1": "thin-sc-1"
      "provisioner-2": "thin-sc-2"

Using an annotation like cdi.kubevirt.io/snapshot.restore.storageClass: "thin-sc" would be a simpler alternative.
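For illustration, on the DV such an annotation could look like this (the key is only a proposal here, not something CDI currently recognizes):

apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: new-datavolume-from-snapshot
  annotations:
    cdi.kubevirt.io/snapshot.restore.storageClass: "thin-sc"   # proposed key, not implemented
spec:
  source:
    snapshot:
      name: blank-datavolume-snapshot
      namespace: default
  pvc:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 2Gi
    storageClassName: ms-xfs-snapshot-restore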

@akalenyu
Collaborator

Perhaps, similar to the CDI config status field scratchSpaceStorageClass, we could have a snapshotRestoreConfig or similar section. [...] Using an annotation like cdi.kubevirt.io/snapshot.restore.storageClass: "thin-sc" would be a simpler alternative.

I like the annotation solution more. Introducing more APIs for the host-assisted fallback side of this is just too much bloat for users to understand. If we go with the annotation, we could take over for our own snapshots and mark them.
That would still unfortunately not be a complete solution, since the storage class could be gone at that point.

I really hoped the storage provisioner could just do whatever it takes to restore. The data gets copied byte for byte over the network to the target PVC later anyway.

@TheRealSibasishBehera
Author

Agreed, manual configuration isn't ideal for a user. It's just a bit frustrating to see that while direct PVC snapshot restores work, DataVolume (being an abstraction over PVC) snapshot restores fail.

We're constrained by the lack of standardization in how storage provisioners handle snapshot operations. A solution where CDI figures out the best storage class configuration and creates it involves too many variables, and is overkill for CDI to do.

The best workaround IMO, one that makes sure the storage class still exists at the point the DataVolume is created, might just be an annotation on the DV.

@akalenyu
Collaborator

The best workaround IMO, one that makes sure the storage class still exists at the point the DataVolume is created, might just be an annotation on the DV.

I was thinking more of an annotation on the snapshot object, wdyt?
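For illustration, roughly something like this (the key name is purely hypothetical at this point):

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: blank-datavolume-snapshot
  annotations:
    cdi.kubevirt.io/snapshot.restore.storageClass: "thin-sc"   # hypothetical key
spec:
  source:
    persistentVolumeClaimName: blank-datavolume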

@TheRealSibasishBehera
Author

Both are very similar, but a DataVolume annotation gives a little more control at the point of restore. With an annotation on the VolumeSnapshot, by the time the user tries to create a DV the storage class might already be deleted, and the user would have to find a suitable storage class and reapply the annotation.

@TheRealSibasishBehera
Author

@akalenyu I am happy to make a PR to close this issue if you agree with the approach. Let me know.

@akalenyu
Collaborator

Sounds good, I just want to poke at the use case some more. Why were you falling back to host-assisted in your case?

@TheRealSibasishBehera
Author

TheRealSibasishBehera commented Nov 21, 2024

CDI expects to use the populator only when the storage class is a CSI storage class; to determine this, CDI checks for the CSIDriver object.

Smart clone is designed to be a subset of the external populator feature (please correct me if I am wrong on this), and if this external populator annotation is false, it falls back to the host-assisted path.

OpenEBS Mayastor doesn't have a CSIDriver object, yet it still supports snapshots. I am not sure if this is something CDI should solve, because CSIDriver, although not the best way to determine snapshot capability, is a Kubernetes standard. I am not sure if it is used for conformance though.

An approach to solve this might be to not check the external populator annotation, but to actually list/create a VolumeSnapshotClass to determine snapshot capability.
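A rough sketch of that alternative check (not how CDI currently decides; assumes the external-snapshotter v1 API types and a controller-runtime client):

// Hypothetical: treat the provisioner as snapshot-capable if any
// VolumeSnapshotClass declares it as its driver, instead of requiring a
// CSIDriver object to exist.
func hasSnapshotClassForProvisioner(ctx context.Context, c client.Client, provisioner string) (bool, error) {
    snapClasses := &snapshotv1.VolumeSnapshotClassList{}
    if err := c.List(ctx, snapClasses); err != nil {
        return false, err
    }
    for _, snapClass := range snapClasses.Items {
        if snapClass.Driver == provisioner {
            return true, nil
        }
    }
    return false, nil
}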
