[ISSUE-1051] Handle Kubelet's Wrong CSI Call Inconsistent with Real Volume Status for OBS Release 1.3 #1053

Open
wants to merge 18 commits into
base: rel/objectscale-1.3

CraneShiEMC (Collaborator) commented Aug 27, 2023

Purpose

Resolves #1051

  1. Handle kubelet CSI calls that are inconsistent with the real volume status
  2. Also process kubelet's CSI calls on volumes in Failed status

PR checklist

  • Add link to the issue
  • Choose Project
  • Choose PR label
  • New unit tests added
  • Modified code has meaningful comments
  • All TODOs are linked with the issues
  • All comments are resolved

Testing

I simulated the scenario in which a volume's k8s global device mountpoint is missing in my standalone test. This CSI defensive enhancement works as expected in that test.

custom ci passed: https://asd-ecs-jenkins.isus.emc.com/job/csi-custom-ci/1562/

custom-acceptance passed:
Atlantic (rke2): https://asd-ecs-jenkins.isus.emc.com/job/csi-custom-acceptance-tar_b_ona/39/
Openshift: https://asd-ecs-jenkins.isus.emc.com/job/csi-custom-acceptance-oil_bd/255/

yimingwangdell and others added 17 commits July 18, 2023 14:10
* add StorageGroupStatus

Signed-off-by: Shi, Crane <[email protected]>

* support StorageGroupStatus in current workflows

Signed-off-by: Shi, Crane <[email protected]>

* refine log

Signed-off-by: Shi, Crane <[email protected]>

* refine error handling

Signed-off-by: Shi, Crane <[email protected]>

* trigger storage group resync if applicable in drive removal

Signed-off-by: Shi, Crane <[email protected]>

* A drive whose Usage is REMOVED will not be selected in any storage group and its existing sg label takes no effect

Signed-off-by: Shi, Crane <[email protected]>

* sg feature will not apply to drive physically removed

Signed-off-by: Shi, Crane <[email protected]>

* handle the drive removal case of drive with manual sg label

Signed-off-by: Shi, Crane <[email protected]>

* fix go lint error

Signed-off-by: Shi, Crane <[email protected]>

* add UT case for drive-removal-triggered sg sync

Signed-off-by: Shi, Crane <[email protected]>

* improve UT coverage

Signed-off-by: Shi, Crane <[email protected]>

* refine sg annotation for drive removal

Signed-off-by: Shi, Crane <[email protected]>

* handle case of invalid sg for drive removal

Signed-off-by: Shi, Crane <[email protected]>

* also exclude removing sg for trigger sg resync in drive removal

Signed-off-by: Shi, Crane <[email protected]>

* refine sg removal status handling

Signed-off-by: Shi, Crane <[email protected]>

* Revert "refine error handling"

This reverts commit 06607e7.

Signed-off-by: Shi, Crane <[email protected]>

* refine log and some code logic

Signed-off-by: Shi, Crane <[email protected]>

* try to add immutability validation rule to storagegroup spec

Signed-off-by: Shi, Crane <[email protected]>

* upgrade controller-gen version to v0.9.2

Signed-off-by: Shi, Crane <[email protected]>

* add storagegroupcontroller UT initial

Signed-off-by: Shi, Crane <[email protected]>

* Revert "add storagegroupcontroller UT initial"

This reverts commit 1ea8660.

Signed-off-by: Shi, Crane <[email protected]>

* add storagegroupcontroller UT

Signed-off-by: Shi, Crane <[email protected]>

* fix

Signed-off-by: Shi, Crane <[email protected]>

* add storagegroupcontroller UT

Signed-off-by: Shi, Crane <[email protected]>

* refactor and add UT of storagegroupcontroller

Signed-off-by: Shi, Crane <[email protected]>

* add storagegroupcontroller UT

Signed-off-by: Shi, Crane <[email protected]>

* fix storagegroupcontroller UT

Signed-off-by: Shi, Crane <[email protected]>

* add storagegroupcontroller UT

Signed-off-by: Shi, Crane <[email protected]>

* refine the logic of sg deletion

Signed-off-by: Shi, Crane <[email protected]>

* refine

Signed-off-by: Shi, Crane <[email protected]>

* fix bug

Signed-off-by: Shi, Crane <[email protected]>

* fix go-lint err

Signed-off-by: Shi, Crane <[email protected]>

* fix go-lint error

Signed-off-by: Shi, Crane <[email protected]>

* add drive IsClean support, decrease k8s api call, remove manual sg labeling support

Signed-off-by: Shi, Crane <[email protected]>

* fix

Signed-off-by: Shi, Crane <[email protected]>

* fix

Signed-off-by: Shi, Crane <[email protected]>

* fix UT

Signed-off-by: Shi, Crane <[email protected]>

* refine corner case handling

Signed-off-by: Shi, Crane <[email protected]>

* fix

Signed-off-by: Shi, Crane <[email protected]>

* refine and add UT to storagegroupcontroller

Signed-off-by: Shi, Crane <[email protected]>

* refine storagegroupcontroller and add UT

Signed-off-by: Shi, Crane <[email protected]>

* make controller svc's k8scache also sync sg and lvg objs'

Signed-off-by: Shi, Crane <[email protected]>

* use k8s cache, re-support sg label manual change and refine in sg ctrl

Signed-off-by: Shi, Crane <[email protected]>

* fix lint err

Signed-off-by: Shi, Crane <[email protected]>

* add storagegroupcontroller UT

Signed-off-by: Shi, Crane <[email protected]>

* add storagegroupcontroller UT

Signed-off-by: Shi, Crane <[email protected]>

* storagegroup controller will not reconcile on drive delete event

Signed-off-by: Shi, Crane <[email protected]>

* not support Health, Status, Usage and IsClean as DriveSelector's MatchFields

Signed-off-by: Shi, Crane <[email protected]>

* in storagegroupcontroller's reconcile, only sync drive when reqName is uuid

Signed-off-by: Shi, Crane <[email protected]>

* refine the logic to avoid nil pointer error

Signed-off-by: Shi, Crane <[email protected]>

* revert the usage of k8scache

Signed-off-by: Shi, Crane <[email protected]>

* add custom storage group proposal draft

Signed-off-by: Shi, Crane <[email protected]>

* refine custom storage group proposal

Signed-off-by: Shi, Crane <[email protected]>

---------

Signed-off-by: Shi, Crane <[email protected]>
* fix pr validation startup failure

Signed-off-by: Shi, Crane <[email protected]>

* directly use generated file for fake attach block mode

Signed-off-by: Shi, Crane <[email protected]>

* still use loopback device wrap

Signed-off-by: Shi, Crane <[email protected]>

* refine code for creating fake device

Signed-off-by: Shi, Crane <[email protected]>

* fix go lint error

Signed-off-by: Shi, Crane <[email protected]>

* add mock func implementation

Signed-off-by: Shi, Crane <[email protected]>

* fix go lint

Signed-off-by: Shi, Crane <[email protected]>

* fix UT

Signed-off-by: Shi, Crane <[email protected]>

* refine func

Signed-off-by: Shi, Crane <[email protected]>

* support fake attach block mode with fake device, add removeLoopDevice,

Signed-off-by: Shi, Crane <[email protected]>

* fix go lint

Signed-off-by: Shi, Crane <[email protected]>

* refine

Signed-off-by: Shi, Crane <[email protected]>

* support clean fake device in fake-attach block-mode

Signed-off-by: Shi, Crane <[email protected]>

* support non-existing current fake device case

Signed-off-by: Shi, Crane <[email protected]>

* change fake device dir on host

Signed-off-by: Shi, Crane <[email protected]>

* refine log

Signed-off-by: Shi, Crane <[email protected]>

* support the case of get fake device info failure

Signed-off-by: Shi, Crane <[email protected]>

* clean fake device also in removal of fake-attach block-mode vol

Signed-off-by: Shi, Crane <[email protected]>

* if fake-device ann is invalid, re-create the fake device and update ann

Signed-off-by: Shi, Crane <[email protected]>

* fix

Signed-off-by: Shi, Crane <[email protected]>

* enhance

Signed-off-by: Shi, Crane <[email protected]>

* refine

Signed-off-by: Shi, Crane <[email protected]>

* refine

Signed-off-by: Shi, Crane <[email protected]>

* add comment

Signed-off-by: Shi, Crane <[email protected]>

* add UT

Signed-off-by: Shi, Crane <[email protected]>

* add UT

Signed-off-by: Shi, Crane <[email protected]>

* add UT

Signed-off-by: Shi, Crane <[email protected]>

* add UT

Signed-off-by: Shi, Crane <[email protected]>

* check loop device err shouldn't block subsequent op; clean fake device should also check loop device first

Signed-off-by: Shi, Crane <[email protected]>

* update fake-attach doc accordingly

Signed-off-by: Shi, Crane <[email protected]>

* fix typo

Signed-off-by: Shi, Crane <[email protected]>

* refine doc

Signed-off-by: Shi, Crane <[email protected]>

---------

Signed-off-by: Shi, Crane <[email protected]>
Signed-off-by: Shi, Crane <[email protected]>
Signed-off-by: Shi, Crane <[email protected]>
Signed-off-by: Shi, Crane <[email protected]>
Signed-off-by: Shi, Crane <[email protected]>
Signed-off-by: Shi, Crane <[email protected]>
…new release. (#1039)"

This reverts commit b5ffc6c.

Signed-off-by: Shi, Crane <[email protected]>
…emetal into bugfix-handle-kubelet-wrong-call-obs-1.3

Signed-off-by: Shi, Crane <[email protected]>
libzhang (Collaborator) left a comment:

please fix UT failure in PR validation before merge.

CraneShiEMC (Collaborator, Author) replied:

> please fix UT failure in PR validation before merge.

OK

Signed-off-by: Shi, Crane <[email protected]>
4 participants