CDI importer converting image which is already RAW #3457

waldner · 2024-10-11T13:07:32Z

What happened:

When creating a DV using an HTTP URL as source to import an image in RAW format, the importer runs a conversion pass on it. The relevant log lines:

...
I1011 13:00:14.111593       1 prometheus.go:78] 99.40
I1011 13:00:15.111696       1 prometheus.go:78] 99.80
I1011 13:00:16.111891       1 prometheus.go:78] 100.00
I1011 13:00:16.638904       1 data-processor.go:247] New phase: Convert
I1011 13:00:16.638960       1 data-processor.go:253] Validating image
E1011 13:00:16.655763       1 prlimit.go:156] failed to kill the process; os: process already finished
I1011 13:00:16.655975       1 qemu.go:354] Adding preallocation method: [-o preallocation=falloc]
I1011 13:00:16.656036       1 qemu.go:358] Attempting preallocation method, qemu-img args: [convert -o preallocation=falloc -t writeback -p -O raw /scratch/tmpimage /data/disk.img]
I1011 13:00:16.668451       1 qemu.go:273] 0.00
I1011 13:00:18.530061       1 qemu.go:273] 1.00
...

As said, the image is already RAW; the import over http is already extremely slow despite the http server being located on the same lan; it'd be better not to add yet even more extra time for a no-op conversion to the VM start time.

What you expected to happen:

The mage should be used as-is, or there should be a way to tell the importer to skip the (no-op) conversion.

Additional context:

Here's the DV template from the VM definition:

    dataVolumeTemplates:
    - metadata:
        creationTimestamp: null
        name: myvm-cdisk
      spec:
        preallocation: true
        pvc:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 50Gi
        source:
          http:
            url: http://10.11.12.10:9999/assets/rawimg.img

Environment:

CDI version (use kubectl get deployments cdi-deployment -o yaml): v1.60.3
Kubernetes version (use kubectl version): 1.31.1
DV specification: see above

The text was updated successfully, but these errors were encountered:

awels · 2024-10-11T13:25:27Z

If the source is already RAW, then the conversion is essentially a no-op, and it is just copying the data.

waldner · 2024-10-11T13:26:23Z

Sure, but it still wastes time before the VM can start. Since the initial image is rather large, the (already extremely slow) process of importing it becomes even slower.
Also I might be misremembering, but I think this was not happening in older versions.

waldner · 2024-10-14T15:26:51Z

In fact, why is a conversion needed at all? Qemu and libvirt can work with different image types, so even if the imported image were (say) QCOW2 it could still be used as-is.
What is the rationale behind the conversion process? I think the user should be given more control over it.

awels · 2024-10-14T19:05:39Z

So we do the conversion because we want all the images to be raw. This allows us to directly use built in kubernetes APIs and worry about different formats. This also allows us to use the built in capabilities of the kubernets platform (snapshots, etc). We could use qcow2 format most likely, but now we need a different approach when we want to use that with block devices. It just makes everything more complex. This approach has served us really well and we are unlikely to diverge.

waldner · 2024-10-14T19:44:22Z

thanks for the explanation; still, I maintain that a raw-to-raw conversion is unneeded and can delay the VM startup by a few minutes if the image is big.
I don't know the storage details, but it might also waste space if both images need to be stored at the same time, at least for the duration of the conversion (I might be wrong on this last one though).

awels · 2024-10-14T20:03:36Z

I agree with you, I just double checked the code as I was like, there is no reason to not write directly to the target once we determine this is a raw file. But in the code we write to scratch space first and then write to the target. I have to go look up the history of why it is like this. There might be some validation on sizes and stuff that we can only get from writing to scratch space first and then copying the data.

aglitke · 2024-10-21T12:24:54Z

One reason we do the conversion unconditionally is so that we can respect the sparseness of the image on the target PVC. In the past we did not have a sparse writer and the so the only way to do this was with explicit conversion. Now @mhenriks has provided a sparse writer so it could be possible to write directly to the target while respecting sparseness. Alvaro is going to take a look and see if this can be done...

/assign @alromeros

waldner · 2024-10-21T22:14:43Z

Thanks. Either way, I would give the user more power to decide, if they want to trade optimizations for VM startup speed, they should be able to do it.

waldner added the kind/bug label Oct 11, 2024

kubevirt-bot assigned alromeros Oct 21, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CDI importer converting image which is already RAW #3457

CDI importer converting image which is already RAW #3457

waldner commented Oct 11, 2024 •

edited

Loading

awels commented Oct 11, 2024

waldner commented Oct 11, 2024 •

edited

Loading

waldner commented Oct 14, 2024

awels commented Oct 14, 2024

waldner commented Oct 14, 2024 •

edited

Loading

awels commented Oct 14, 2024

aglitke commented Oct 21, 2024

waldner commented Oct 21, 2024

CDI importer converting image which is already RAW #3457

CDI importer converting image which is already RAW #3457

Comments

waldner commented Oct 11, 2024 • edited Loading

awels commented Oct 11, 2024

waldner commented Oct 11, 2024 • edited Loading

waldner commented Oct 14, 2024

awels commented Oct 14, 2024

waldner commented Oct 14, 2024 • edited Loading

awels commented Oct 14, 2024

aglitke commented Oct 21, 2024

waldner commented Oct 21, 2024

waldner commented Oct 11, 2024 •

edited

Loading

waldner commented Oct 11, 2024 •

edited

Loading

waldner commented Oct 14, 2024 •

edited

Loading