Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Add support for KVM_GET_XSAVE2 ioctls #261

Closed
wants to merge 1 commit into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Add support for KVM_GET_XSAVE2 ioctls
Since Linux 5.17, the `kvm_xsave` struct has a flexible array member
(FAM) at the end, which can be retrieved using the `KVM_GET_XSAVE2`
ioctl [1]. What makes this FAM special is that the length is not stored
in the header, but has to be retrieved via
`KVM_CHECK_CAPABILITY(KVM_CAP_XSAVE2)`, which returns the total size of
the `kvm_xsave` struct (e.g. the traditional 4096 byte header + the size
of the FAM). Thus, to support `KVM_GET_XSAVE2`, we first need to check
the capability on the VM fd, construct a FamStructWrapper of the correct
size, and then call `KVM_GET_XSAVE2`.

[1]: https://www.kernel.org/doc/html/latest/virt/kvm/api.html#kvm-get-xsave2

Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
roypat committed Apr 24, 2024
commit 28b79530411ca55c2d48ae268faea356d41578db
3 changes: 3 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -4,6 +4,9 @@

### Added

- [[#261](https://github.com/rust-vmm/kvm-ioctls/pull/261)]: Add support
for `KVM_CAP_XSAVE2` and the `KVM_GET_XSAVE2` ioctl.

### Changed

## v0.17.0
2 changes: 2 additions & 0 deletions src/cap.rs
Original file line number Diff line number Diff line change
@@ -93,6 +93,8 @@ pub enum Cap {
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
Xsave = KVM_CAP_XSAVE,
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
Xsave2 = KVM_CAP_XSAVE2,
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
Xcrs = KVM_CAP_XCRS,
PpcGetPvinfo = KVM_CAP_PPC_GET_PVINFO,
PpcIrqLevel = KVM_CAP_PPC_IRQ_LEVEL,
48 changes: 48 additions & 0 deletions src/ioctls/vcpu.rs
Original file line number Diff line number Diff line change
@@ -853,6 +853,41 @@ impl VcpuFd {
Ok(xsave)
}

/// X86 specific call that returns the vcpu's current "xsave struct" via `KVM_GET_XSAVE2`.
///
/// See the documentation for `KVM_GET_XSAVE2` in the
/// [KVM API doc](https://www.kernel.org/doc/Documentation/virtual/kvm/api.txt).
///
/// # Arguments
///
/// * `vm_fd` - the file descriptor of the VM this [`Vcpu`] belongs to.
///
/// # Example
///
/// ```rust
/// # extern crate kvm_ioctls;
/// # use kvm_ioctls::Kvm;
/// let kvm = Kvm::new().unwrap();
/// let vm = kvm.create_vm().unwrap();
/// let vcpu = vm.create_vcpu(0).unwrap();
/// let xsave = vcpu.get_xsave2(&vm).unwrap();
/// ```
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
pub fn get_xsave2(&self, vm_fd: &crate::VmFd) -> Result<Xsave> {
let xsave_size = vm_fd.xsave_size();
Copy link
Contributor

@zulinx86 zulinx86 Mar 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if we really should get the required size internally.

VMM may check KVM caps beforehand (to fail faster). In that case, VMM can cache the XSAVE size somewhere and reuse it without calling KVM_CHECK_EXTENSIONS. For more flexibility of the VMM implementation, it could be better to keep xsave_size() out of get_xsave2().

Rather, not having xsave_size() in the first place might be better. In the first look, handling both cases (KVM_GET_XSAVE2 supported and not supported) looks nice. But a VMM supporting multiple kernel versions supporting KVM_GET_XSAVE and KVM_GET_XSAVE2 (like firecracker) should check the availability of the new API first any way, and then, if not available, it should fall back to KVM_GET_XSAVE. If we really want to absorb the differences of these two APIs, we should implement the fallback part internally.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mhh, yeah, I think I see what you mean. In the context of get_xsave2, having the fallback of xsave_size() to be 4096 when CAP_XSAVE2 isnt supported it fairly useless, because the KVM_GET_XSAVE2 call is gonna fail anyway. So it's kinda half-baked.

The problem API wise is that we cannot let the user allocate the Xsave struct, because if they allocate a too-small one, then get_xsave2 will result in undefined behavior potentially (e.g. the kernel writing past the allocated buffer). We could mark it unsafe, but that's also not very nice imo.

And yeah, I guess if all Xsave structs are created from within kvm-ioctls, then its not even useful to expose xsave_size() as an API altogether.

Copy link
Contributor

@zulinx86 zulinx86 Mar 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least for me, marking it unsafe isn't so bad. unsafe marked functions always make me (hopefully all Rustaceans) pay my highest attention to it. Of course, safe wrapper is the best, but as being discussed in another thread, it may be impossible to wrap it safely due to the possible race condition.

So my current humble suggestion is:

  • Mark get_xsave2() unsafe
  • Make get_xsave2() fail early if KVM_CAP_XSAVE2 isn't supported
  • Add lower wrappers like we do for check_extension()
    • Add get_xsave2_xsave() which allows users to pass &mut Xsave.
    • Add get_xsave2_raw() which allows users to pass a pointer to struct kvm_xsave.

let fam_size = xsave_size - std::mem::size_of::<kvm_xsave>();
let mut xsave = Xsave::new(fam_size).map_err(|_| errno::Error::new(libc::EINVAL))?;

// SAFETY: Here we trust the kernel not to read past the end of the kvm_xsave struct.
let ret = unsafe {
ioctl_with_mut_ref(self, KVM_GET_XSAVE2(), &mut xsave.as_mut_fam_struct().xsave)
};
if ret != 0 {
return Err(errno::Error::last());
}
Ok(xsave)
}

/// X86 specific call that sets the vcpu's current "xsave struct".
///
/// See the documentation for `KVM_SET_XSAVE` in the
@@ -884,6 +919,13 @@ impl VcpuFd {
Ok(())
}

/// Convenience function for doing `KVM_SET_XSAVE` with the FAM-enabled [`Xsave`]
/// instead of the pre-5.17 plain [`kvm_xsave`].
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
pub fn set_xsave2(&self, xsave: &Xsave) -> Result<()> {
self.set_xsave(&xsave.as_fam_struct_ref().xsave)
}

/// X86 specific call that returns the vcpu's current "xcrs".
///
/// See the documentation for `KVM_GET_XCRS` in the
@@ -2200,6 +2242,12 @@ mod tests {
vcpu.set_xsave(&xsave).unwrap();
let other_xsave = vcpu.get_xsave().unwrap();
assert_eq!(&xsave.region[..], &other_xsave.region[..]);
let xsave2 = vcpu.get_xsave2(&vm).unwrap();
assert_eq!(
&xsave.region[..],
&xsave2.as_fam_struct_ref().xsave.region[..]
);
vcpu.set_xsave2(&xsave2).unwrap();
}

#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
14 changes: 14 additions & 0 deletions src/ioctls/vm.rs
Original file line number Diff line number Diff line change
@@ -1293,6 +1293,20 @@ impl VmFd {
self.run_size
}

/// Get the `kvm_xsave` size
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
pub fn xsave_size(&self) -> usize {
match self.check_extension_int(Cap::Xsave2) {
// If KVM does not support KVM_CAP_XSAVE2, then kvm_xsave will not
// have a FAM field, meaning the size of the struct is just the 4096 byte header array.
// Otherwise, KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2) will always return at least 4096,
// and describe the size of the header plus the FAM.
// See https://docs.kernel.org/virt/kvm/api.html#kvm-get-xsave2
..=0 => std::mem::size_of::<kvm_xsave>(),
size => size as usize,
}
}

/// Wrapper over `KVM_CHECK_EXTENSION`.
///
/// Returns 0 if the capability is not available and a positive integer otherwise.
3 changes: 3 additions & 0 deletions src/kvm_ioctls.rs
Original file line number Diff line number Diff line change
@@ -214,6 +214,9 @@ ioctl_iow_nr!(KVM_SET_DEBUGREGS, KVMIO, 0xa2, kvm_debugregs);
/* Available with KVM_CAP_XSAVE */
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
ioctl_ior_nr!(KVM_GET_XSAVE, KVMIO, 0xa4, kvm_xsave);
/* Available with KVM_CAP_XSAVE2 */
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
ioctl_ior_nr!(KVM_GET_XSAVE2, KVMIO, 0xcf, kvm_xsave);
/* Available with KVM_CAP_XSAVE */
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
ioctl_iow_nr!(KVM_SET_XSAVE, KVMIO, 0xa5, kvm_xsave);