Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VM Image Controller, Webhook and Download/Upload Refactor for Various Backend Support #7170

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

WebberHuang1118
Copy link
Member

@WebberHuang1118 WebberHuang1118 commented Dec 16, 2024

Code Structure:

  • Backend interfaces:
    • pkg/image/backend
  • Common VM Image operation package
    • pkg/image/common
  • Backing image implementation
    • pkg/image/backingimage
  • CDI implementation skeleton
    • pkg/image/cdi

Problem:
VM Image should support various source backend

Solution:
Enhance VM Image implementation to have a clear interface for different image backend

Related Issue:
#6936

HEP:
#7273

Test plan:

  • Prerequisites
    • Building Harvester Cluster with the master-head branch ( 0cea310be669040bc682c75edd40fd6f81051bce in my env)
    • Updating VirtualMachineImage CRD
      • $ kubectl apply -f deploy/charts/harvester-crd/templates/harvesterhci.io_virtualmachineimages.yaml
    • Updating harvester deployment with this PR
    • Updating harvester-webhook deployment with this PR
  • Case 1: Creating VM image with upload
    • Uploading an OS image file from local
    • Checking the VM image could progress to the completed status
    • Creating a VM with the VM image
  • Case 2: Creating VM image with download
    • Downloading an OS image file from a certain URL
    • Checking the VM image could progress to the completed status
    • Creating a VM with the VM image
  • Case 3: Creating VM image with export
    • Creating a volume with an existing OS VM Image
    • Exporting the volume as a new image
    • Checking the exported VM image could progress to the completed status
    • Creating a VM with the VM image
  • Case 4: VM Image encryption
    • Creating a secret and sc for encryption
    • Creating an encrypted VM image from an existing VM image (non-encrypted)
    • Checking the encrypted VM image could progress to the completed status
    • Creating a VM with the VM image
  • Case 5: VM Image decryption
    • Continuing from the above steps
    • Creating a VM image by decrypting the encrypted VM image
    • Checking the decrypted VM image could progress to the completed status
    • Creating a VM with the VM image
  • Case 6: VM Image Downloading
    • Triggering downloading an existing VM image
    • The VM image should downloaded as a compressed file

** Additional Context **
Since the VM image implementation has the new interfaces, the original unit testing is not applicable and removed. Creating a separate ticket to track the unit testing implementation #7204

Copy link
Member

@brandboat brandboat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the great refactor!

pkg/image/backingimage/upload.go Outdated Show resolved Hide resolved
pkg/image/common/operator.go Outdated Show resolved Hide resolved
pkg/image/backingimage/upload.go Outdated Show resolved Hide resolved
Copy link
Contributor

@Yu-Jack Yu-Jack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice try! There is a suggestion for folder structure:


I think our folder structure should match what it's , and backend might be a confused naming.

For example, you could move pkg/image/backend/backend.go to pkg/master/controller/image/interface.go, and rename the interface name like CSI or CSIInterface etc.

Then we could use import pkg/master/controller/image, and use image.CSIInterface to present the meaning instead of backend.Backend.

The pkg/image/backingimage/backing_image_controller.go should be in pkg/master/controller. Spreading the controller on different places might not match our current folder structure, including mutator and validator.

But, I guess that you think putting them on same folder is more easier for finding and looking , So, there is another way to organize them. Create a new folder to present our core (real business) logic. That means we could put similar feature into this folder in the future if we need.

core
└── image
    ├── backingimage
    │   ├── backing_image_controller.go
    │   ├── backingimage.go
    │   ├── download.go
    │   ├── mutate.go
    │   ├── upload.go
    │   └── validate.go
    ├── cdi
    │   ├── cdi.go
    │   ├── download.go
    │   ├── mutate.go
    │   ├── upload.go
    │   └── validate.go
    ├── common
    │   ├── mutator.go
    │   ├── operator.go
    │   └── validator.god
    └── interface.go

However, I still think backing_image_controller.go should be master/controller, and keep mutator.go and validator.go in original folder. We can wait for other people's feedbacks.

@WebberHuang1118
Copy link
Member Author

Thanks for the feedbacks @Yu-Jack

Nice try! There is a suggestion for folder structure:

I think our folder structure should match what it's , and backend might be a confused naming.

For example, you could move pkg/image/backend/backend.go to pkg/master/controller/image/interface.go, and rename the interface name like CSI or CSIInterface etc.

Then we could use import pkg/master/controller/image, and use image.CSIInterface to present the meaning instead of backend.Backend.

It's a good idea, I will update it.

The pkg/image/backingimage/backing_image_controller.go should be in pkg/master/controller. Spreading the controller on different places might not match our current folder structure, including mutator and validator.

Because pkg/image/backingimage/backing_image_controller.go implements the sidecar handler for LH image, which is specific to LH backingimage, Harvester may support all the image features by CDI someday. IMO, placing the sidecar handler implementation at the general path pkg/master/controller seems not suitable.

But, I guess that you think putting them on same folder is more easier for finding and looking , So, there is another way to organize them. Create a new folder to present our core (real business) logic. That means we could put similar feature into this folder in the future if we need.

core
└── image
    ├── backingimage
    │   ├── backing_image_controller.go
    │   ├── backingimage.go
    │   ├── download.go
    │   ├── mutate.go
    │   ├── upload.go
    │   └── validate.go
    ├── cdi
    │   ├── cdi.go
    │   ├── download.go
    │   ├── mutate.go
    │   ├── upload.go
    │   └── validate.go
    ├── common
    │   ├── mutator.go
    │   ├── operator.go
    │   └── validator.god
    └── interface.go

I will consider it, thanks.

However, I still think backing_image_controller.go should be master/controller, and keep mutator.go and validator.go in original folder. We can wait for other people's feedbacks.

The VM image mutator and validator are still provided by the original path pkg/webhook/resources/virtualmachineimage/mutator.go and pkg/webhook/resources/virtualmachineimage/validator.go.

pkg/image/backingimage/mutate.go and pkg/image/backingimage/validate.go aim to provide the common library for VM Image mutation and validation.

@Yu-Jack
Copy link
Contributor

Yu-Jack commented Jan 6, 2025

Because pkg/image/backingimage/backing_image_controller.go implements the sidecar handler for LH image, which is specific to LH backingimage, Harvester may support all the image features by CDI someday. IMO, placing the sidecar handler implementation at the general path pkg/master/controller seems not suitable.

You could organize like this. It's just an example.

controller/master/image
├── register.go
├── third-party
│   ├── backingimage
│   │   ├── backing_image_controller.go
│   │   ├── backingimage.go
│   │   ├── download.go
│   │   ├── mutate.go
│   │   ├── upload.go
│   │   └── validate.go
│   └── cdi
│       ├── cdi.go
│       ├── download.go
│       ├── mutate.go
│       ├── upload.go
│       └── validate.go
└── vm_image_controller.go

The VM image mutator and validator are still provided by the original path pkg/webhook/resources/virtualmachineimage/mutator.go and pkg/webhook/resources/virtualmachineimage/validator.go.

pkg/image/backingimage/mutate.go and pkg/image/backingimage/validate.go aim to provide the common library for VM Image mutation and validation.

Yeah, I also treat pkg/webhook/resources/virtualmachineimage/mutator.go and pkg/webhook/resources/virtualmachineimage/validator.go as a wrapper to wrap the core business logic. But, you could still do similar things like above to organize them if we're willing to do so. The point I'd like to highlight is the folder structure. Should we put all related business logic into same folder or organize by validator, mutator or controller like this? I just want to provide different perspectives to this topic.

webhook/resources/virtualmachineimage
├── mutator.go
├── third-party
│   ├── backingimage
│   │   ├── mutate.go
│   │   └── validate.go
│   └── cdi
│       ├── cdi.go
│       ├── mutate.go
│       └── validate.go
└── validator.go

controller/master/image
├── register.go
├── third-party
│   ├── backingimage
│   │   ├── backing_image_controller.go
│   │   └── backingimage.go
│   └── cdi
│       └── cdi.go
└── vm_image_controller.go

api/image
├── formatter.go
├── schema.go
└── third-party
    ├── backingimage
    │   ├── download.go
    │   └── upload.go
    └── cdi
        ├── cdi.go
        ├── download.go
        └── upload.go

@WebberHuang1118
Copy link
Member Author

WebberHuang1118 commented Jan 6, 2025

Because pkg/image/backingimage/backing_image_controller.go implements the sidecar handler for LH image, which is specific to LH backingimage, Harvester may support all the image features by CDI someday. IMO, placing the sidecar handler implementation at the general path pkg/master/controller seems not suitable.

You could organize like this. It's just an example.

controller/master/image
├── register.go
├── third-party
│   ├── backingimage
│   │   ├── backing_image_controller.go
│   │   ├── backingimage.go
│   │   ├── download.go
│   │   ├── mutate.go
│   │   ├── upload.go
│   │   └── validate.go
│   └── cdi
│       ├── cdi.go
│       ├── download.go
│       ├── mutate.go
│       ├── upload.go
│       └── validate.go
└── vm_image_controller.go

The VM image mutator and validator are still provided by the original path pkg/webhook/resources/virtualmachineimage/mutator.go and pkg/webhook/resources/virtualmachineimage/validator.go.
pkg/image/backingimage/mutate.go and pkg/image/backingimage/validate.go aim to provide the common library for VM Image mutation and validation.

Yeah, I also treat pkg/webhook/resources/virtualmachineimage/mutator.go and pkg/webhook/resources/virtualmachineimage/validator.go as a wrapper to wrap the core business logic. But, you could still do similar things like above to organize them if we're willing to do so. The point I'd like to highlight is the folder structure. Should we put all related business logic into same folder or organize by validator, mutator or controller like this? I just want to provide different perspectives to this topic.

webhook/resources/virtualmachineimage
├── mutator.go
├── third-party
│   ├── backingimage
│   │   ├── mutate.go
│   │   └── validate.go
│   └── cdi
│       ├── cdi.go
│       ├── mutate.go
│       └── validate.go
└── validator.go

controller/master/image
├── register.go
├── third-party
│   ├── backingimage
│   │   ├── backing_image_controller.go
│   │   └── backingimage.go
│   └── cdi
│       └── cdi.go
└── vm_image_controller.go

api/image
├── formatter.go
├── schema.go
└── third-party
    ├── backingimage
    │   ├── download.go
    │   └── upload.go
    └── cdi
        ├── cdi.go
        ├── download.go
        └── upload.go

My idea is if anyone wants to add another image solution (e.q. LH v2 image), we should make the task as easy as possible, so I encapsulate the details for VM Image, and it's provided as an interface/library.

By the current directory structure:

pkg/image/
├── backend
│   └── backend.go
├── backingimage
│   ├── backing_image_controller.go
│   ├── backingimage.go
│   ├── download.go
│   ├── mutate.go
│   ├── upload.go
│   └── validate.go
├── cdi
│   ├── cdi.go
│   ├── download.go
│   ├── mutate.go
│   ├── upload.go
│   └── validate.go
└── common
    ├── mutator.go
    ├── operator.go
    └── validator.go

If there is a requirement for new image solution, ideally, it only needs to add implementations under pkg/image. Compared to the structure:

webhook/resources/virtualmachineimage
├── mutator.go
├── third-party
│   ├── backingimage
│   │   ├── mutate.go
│   │   └── validate.go
│   └── cdi
│       ├── cdi.go
│       ├── mutate.go
│       └── validate.go
└── validator.go

controller/master/image
├── register.go
├── third-party
│   ├── backingimage
│   │   ├── backing_image_controller.go
│   │   └── backingimage.go
│   └── cdi
│       └── cdi.go
└── vm_image_controller.go

api/image
├── formatter.go
├── schema.go
└── third-party
    ├── backingimage
    │   ├── download.go
    │   └── upload.go
    └── cdi
        ├── cdi.go
        ├── download.go
        └── upload.go

the new implementation will be spreaded among different places.

Should we put all related business logic into same folder or organize by validator, mutator or controller like this? I just want to provide different perspectives to this topic.

IMO, the business logic is still controlled by the original way, I mean:

  • Controller: pkg/controller/master/image/vm_image_controller.go
  • Webhook: pkg/webhook/resources/virtualmachineimage/mutator.go and pkg/webhook/resources/virtualmachineimage/validator.go
  • API handler: pkg/api/image/formatter.go

Source codes under pkg/image are provided for the implementation of different image solutions, and are passively invoked by the original control path.

I can understand your concern, I think we can wait for other colleagues' feedback @bk201 @brandboat @Vicente-Cheng @FrankYang0529

@brandboat
Copy link
Member

I believe what Jack is trying to highlight is the importance of consistency. It seems that all modules currently follow the same pattern, with mutators and validators placed under the webhook directory, and controllers under the controller directory. Personally, I lean towards keeping things as they are, which means sticking to this pattern.

However, the drawback is clear— as you mentioned, the backing image code is scattered across multiple locations. I believe this is a separate issue and might be worth opening a new discussion or issue. Additionally, as we plan to support third-party storage in more features, I believe this problem may not be limited to VM images and could affect other areas (like vm backup) as well.

@WebberHuang1118
Copy link
Member Author

WebberHuang1118 commented Jan 10, 2025

I believe what Jack is trying to highlight is the importance of consistency. It seems that all modules currently follow the same pattern, with mutators and validators placed under the webhook directory, and controllers under the controller directory. Personally, I lean towards keeping things as they are, which means sticking to this pattern.

However, the drawback is clear— as you mentioned, the backing image code is scattered across multiple locations. I believe this is a separate issue and might be worth opening a new discussion or issue. Additionally, as we plan to support third-party storage in more features, I believe this problem may not be limited to VM images and could affect other areas (like vm backup) as well.

Hi @brandboat,

Thanks for your feedback, however, I want to point out, that the pattern structure has not changed:

  • Controller: pkg/controller/master/image/vm_image_controller.go
  • Webhook: pkg/webhook/resources/virtualmachineimage/mutator.go and pkg/webhook/resources/virtualmachineimage/validator.go
  • API handler: pkg/api/image/formatter.go

the control paths for the controller, mutator, validator, and API handlers are organized as before, they just access an abstracted layer/interface/implementation from pkg/image/, which is passively invoked.

IMO, if anyone wants to add a new image solution, he doesn't need to care about the control flow for the controller, webhook, API handler, and any details about CRD VMImage (VMImage operations are encapsulated in pkg/image/common), he just needs to focus on the interfaces exported in pkg/image/. This seems simpler to me.

@brandboat
Copy link
Member

the control paths for the controller, mutator, validator, and API handlers are organized as before, they just access an abstracted layer/interface/implementation from pkg/image/, which is passively invoked.

IMO, if anyone wants to add a new image solution, he doesn't need to care about the control flow for the controller, webhook, API handler, and any details about CRD VMImage, he just needs to focus on the interfaces exported in pkg/image/. This seems simpler for me.

OK, that makes sense to me. Thanks for the clarification!

pkg/controller/master/image/vm_image_controller.go Outdated Show resolved Hide resolved
Comment on lines +101 to +105
func (vmio *vmiOperator) GetName(vmi *harvesterv1.VirtualMachineImage) string {
return vmi.Name
}

func (vmio *vmiOperator) GetNamespace(vmi *harvesterv1.VirtualMachineImage) string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be just my habit of taste, but I feel having a bunch of getters here is a bit overkill. Or do we need this for purpose?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, if anyone wants to add a new image solution, he doesn't need to care about the control flow for the controller, webhook, API handler, and any details about CRD VMImage (VMImage operations are encapsulated in pkg/image/common), he just needs to focus on the interfaces exported in pkg/image/. This seems simpler to me.

As mentioned, I'd like to encapsulate the VM Image's details for any image solution (including backing image and third-party), the image implementation uses the provided libraries to operate VM Images (rather than directly manipulate the object).

That gives the integration a clear interface/concept and consistency to me. WDYT? @brandboat

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that VirtualMachineImage is similar to a DTO, and we can directly access elements like vmi.name or vmi.namespace. If there’s no additional logic in these getter methods, I’m wondering if there is much benefit in using vmio.GetName(vmi). This leaves me a bit uncertain about which approach to choose: directly invoking the DTO field (e.g., vmi.Name) or using the GetName method? I’ve noticed that some places use getters while others don’t, and I’m curious if there’s a particular reason or preference for one approach over the other.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A quick question, are there any other objects that need to implement the VMIOperator interface? Since we treat the VirtualMachineImage as a simple object to accept different parameters to connect with BackingImage and CDI, do we still need the VMIOperator interface? From my perspective, maybe we don't need to?

Copy link
Member Author

@WebberHuang1118 WebberHuang1118 Jan 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that VirtualMachineImage is similar to a DTO, and we can directly access elements like vmi.name or vmi.namespace. If there’s no additional logic in these getter methods, I’m wondering if there is much benefit in using vmio.GetName(vmi). This leaves me a bit uncertain about which approach to choose: directly invoking the DTO field (e.g., vmi.Name) or using the GetName method? I’ve noticed that some places use getters while others don’t, and I’m curious if there’s a particular reason or preference for one approach over the other.

A quick question, are there any other objects that need to implement the VMIOperator interface? Since we treat the VirtualMachineImage as a simple object to accept different parameters to connect with BackingImage and CDI, do we still need the VMIOperator interface? From my perspective, maybe we don't need to?

From the original design and my opinion, for any image solution integration in pkg/image (e.q. pkg/image/backingimage and pkg/image/cdi), it should just focus on implementing interfaces exported at pkg/image/backend.

So in such a point of view, the particular image solution implementation/integration should not directly manipulate the resource VM Image. More generally speaking, it shouldn't have to care about any details in the resource VM image, instead, it only has to know the methods/interfaces (at pkg/image/common) that are provided to operate VM Images.

Copy link
Member

@brandboat brandboat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've skimmed through the code, and overall seems good to me, only a few nits left. I'll test basic VMI operations later. Thanks for the PR!

pkg/webhook/resources/virtualmachineimage/mutator.go Outdated Show resolved Hide resolved
Copy link
Member

@brandboat brandboat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

Test below cases and all passed.

  • Create image through URL download
  • Create image through upload
  • Failed to create image by using invalid URL and check image retry 3 times, error message pop up correctly.
  • Download image from existing image

…d for various backend support

Signed-off-by: Webber Huang <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants