Merged Drivers

From Open-IOV
Revision as of 00:22, 2 March 2023 by Arthur (talk | contribs)

This page provides specifications and details on the current state of host DRM + VFIO-Mdev drivers for each supported vendor.

In the context of this page the term "Merged Driver" refers to drivers which allow simultaneous acceleration of the host using a device's PF (Physical Function) and acceleration of guests using one or more VFs (Virtual Functions) created using the same device.
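The PF/VF relationship described above can be inspected from the host. The following is a minimal sketch, assuming a Linux host that exposes the standard PCI sysfs attributes `sriov_totalvfs` and `sriov_numvfs` and the `virtfnN` symlinks on the PF. The function names and the `sysfs_root` parameter are illustrative and are not part of any vendor driver.

```python
import os
from pathlib import Path


def read_sriov_counts(bdf, sysfs_root="/sys"):
    """Return (enabled_vfs, total_vfs) for the PF at PCI address `bdf`.

    Returns None when the device does not expose the SR-IOV attributes.
    """
    dev = Path(sysfs_root) / "bus" / "pci" / "devices" / bdf
    num = dev / "sriov_numvfs"
    total = dev / "sriov_totalvfs"
    if not (num.is_file() and total.is_file()):
        return None
    return int(num.read_text().strip()), int(total.read_text().strip())


def list_virtual_functions(bdf, sysfs_root="/sys"):
    """Return the PCI addresses of the VFs currently created on the PF.

    Each VF appears on the PF as a symlink named virtfn0, virtfn1, ...
    """
    dev = Path(sysfs_root) / "bus" / "pci" / "devices" / bdf
    links = sorted(dev.glob("virtfn*"), key=lambda p: int(p.name[len("virtfn"):]))
    # The link target ends in the VF's own PCI address.
    return [os.path.basename(os.readlink(link)) for link in links]
```

For example, `read_sriov_counts("0000:00:02.0")` would report how many VFs are enabled on a PF at that (hypothetical) address, and `list_virtual_functions("0000:00:02.0")` would list their addresses.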

Merged functionality is currently supported by drivers that make use of VFIO-Mdev, SR-IOV, or both, depending on the vendor and driver implementation.

An absence of critical technical documentation has historically slowed growth and adoption of developer ecosystems for GPU virtualization.

This CC-BY-4.0 licensed content can either be used with attribution or serve as inspiration for new developer documentation created by GPU vendors for public commercial distribution.

Where possible, this documentation will clearly label dates and versions of observed-but-not-guaranteed behaviour vs. vendor-documented stable interfaces/behaviour with guarantees of forward or backward compatibility.

Intel i915

Intel's slides mention the ability to accelerate up to '8 VMs plus DOM0'. Source: https://01.org/sites/default/files/documentation/an_introduction_to_intel_gvt-g_for_external.pdf

Intel currently supports host DRM and VFIO-Mdev/SR-IOV functionality in its current i915 driver sources (VFIO-Mdev/GVT-g) and its upstreaming i915 driver sources (VFIO-Mdev/SR-IOV).
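On a host where a VFIO-Mdev capable driver is loaded, the mediated device types offered by a parent device can be enumerated through the kernel's generic mdev sysfs interface. The sketch below assumes the parent device exposes an `mdev_supported_types` directory containing the `name`, `description`, `device_api`, `available_instances` and `create` attributes described in the kernel's mediated device documentation. The function names, the `sysfs_root` parameter and any PCI address passed in are illustrative, and none of this is specific to i915.

```python
import uuid
from pathlib import Path

MDEV_ATTRS = ("name", "description", "device_api", "available_instances")


def _types_dir(parent_bdf, sysfs_root):
    return Path(sysfs_root) / "bus" / "pci" / "devices" / parent_bdf / "mdev_supported_types"


def list_mdev_types(parent_bdf, sysfs_root="/sys"):
    """Return {type_name: {attribute: value}} for a parent device.

    Attributes that a driver does not provide are simply omitted.
    """
    base = _types_dir(parent_bdf, sysfs_root)
    types = {}
    if not base.is_dir():
        return types  # no mdev-capable driver bound to this device
    for type_dir in sorted(base.iterdir()):
        info = {}
        for attr in MDEV_ATTRS:
            attr_file = type_dir / attr
            if attr_file.is_file():
                info[attr] = attr_file.read_text().strip()
        types[type_dir.name] = info
    return types


def create_mdev(parent_bdf, type_name, sysfs_root="/sys", mdev_uuid=None):
    """Request a new mediated device by writing a UUID to the type's `create` file.

    Requires root on a real system. Returns the UUID that was written.
    """
    mdev_uuid = mdev_uuid or str(uuid.uuid4())
    (_types_dir(parent_bdf, sysfs_root) / type_name / "create").write_text(mdev_uuid)
    return mdev_uuid
```

Checking `available_instances` before calling `create_mdev` avoids requesting more devices of a type than the parent reports it can provide.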

Known Issues

  1. SR-IOV functionality is undocumented in the i915 driver API documentation.
    Confirmed affected versions: *

[Figure: A diagram depicting i915's shared host + VFIO-Mdev driver model.]

Resolved Issues

  1. Multiplexing functionality requires the use of modified KVM and Xen hypervisors (KVMGT/XenGT).
    Confirmed affected versions: current i915 driver sources (GVT-g)
    Fixed in: upstreaming i915 driver sources (SR-IOV)

Nvidia

Known Issues

  1. Power management on laptops running mediated graphics functionality may cause graphical errors when the laptop is not plugged in to AC power.
    Confirmed affected versions: 460.32.01, 460.73.04
    Possible mitigation: lore.kernel.org: "vfio/pci: Change the PF power state to D0 before enabling VFs"
  2. VFIO-vmalloc errors may occur as a result of page collisions between host & guest on GPUs with smaller VRAM frame buffer sizes.
    Confirmed affected versions: 460.32.01, 460.73.04
  3. Mdev service daemons may crash or load incorrectly during host runtime, requiring a service restart or reboot.
    Confirmed affected versions: 460.32.01, 460.73.04, 510.xx.xx
  4. Guest drivers fail to initialize correctly when VFIO-Mdev devices are mixed with some VFIO passthrough'd USB hubs.
    Confirmed affected versions: 460.73.01

Resolved Issues

  1. When QEMU initializes a second Mdev device, an IOMMU group binding error occurs in QEMU, preventing the device from being brought up.
    Confirmed affected versions: 460.32.01, 460.73.04
    Fixed in: 510.xx.xx
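
When investigating IOMMU group problems such as the resolved issue above, it can help to see which IOMMU group each mediated device has been placed in. The following is a diagnostic sketch only, assuming the standard `/sys/bus/mdev/devices/<uuid>/iommu_group` symlink is present on the host. It does not reproduce or explain the QEMU error itself, and the function names and `sysfs_root` parameter are illustrative.

```python
import os
from pathlib import Path


def mdev_iommu_groups(sysfs_root="/sys"):
    """Return {mdev_uuid: iommu_group_id or None} for every mediated device."""
    base = Path(sysfs_root) / "bus" / "mdev" / "devices"
    groups = {}
    if not base.is_dir():
        return groups  # mdev bus not present on this host
    for dev in sorted(base.iterdir()):
        link = dev / "iommu_group"
        if link.is_symlink():
            # The link target ends in the numeric group id.
            groups[dev.name] = os.path.basename(os.readlink(link))
        else:
            groups[dev.name] = None  # device has no IOMMU group assigned
    return groups


def shared_groups(groups):
    """Return {group_id: [uuids]} for groups that contain more than one mdev."""
    by_group = {}
    for mdev_uuid, group in groups.items():
        if group is not None:
            by_group.setdefault(group, []).append(mdev_uuid)
    return {g: ids for g, ids in by_group.items() if len(ids) > 1}
```

Running `shared_groups(mdev_iommu_groups())` before starting a guest shows whether any two mediated devices report the same group.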

Module Configuration

Depending on your use case, passing additional parameters when booting the system may be helpful.

The following parameters may be used when loading the module via GRUB or systemd-boot.

Module Parameters

Parameter         | Description                               | Side-Effects
cudahost=1        | Allows use of CUDA on the host system.    | Windows guests may fail.
nvidia.vgpukvm=0  | Disables GPU virtualization on the host.  | (none listed)
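
To confirm that a parameter from the table actually reached the running kernel, the boot command line can be inspected. This is a minimal sketch, assuming the parameters are passed on the kernel command line by the bootloader and are therefore visible in `/proc/cmdline`. The parameter names are copied verbatim from the table above, and the helper names are illustrative.

```python
import shlex


def parse_cmdline(cmdline):
    """Split a kernel command line into {parameter: value}.

    Flags without a value (for example `quiet`) map to None.
    """
    params = {}
    for token in shlex.split(cmdline):
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def check_parameters(cmdline, wanted):
    """Return {parameter: True/False} saying whether each wanted value is set."""
    current = parse_cmdline(cmdline)
    return {name: current.get(name) == value for name, value in wanted.items()}
```

On a live host the input would typically come from `open("/proc/cmdline").read()`, for example `check_parameters(open("/proc/cmdline").read(), {"cudahost": "1"})`.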

AMDGPU

AMDGPU does not currently support VFIO-Mdev functionality. It may be possible to add mediated device (Mdev) support to the Linux kernel's AMDGPU driver sources, similar to the functions in nvidia.ko and i915.ko, to produce a driver suitable for merged host + guest DRM on AMD GPU devices.

Known Issues

  1. Host DRM does not work alongside guest VFIO-Mdev.
    Confirmed affected versions: *
  2. The amdgpu kernel module doesn't contain hooks for guest signalling via irqfd & ioeventfd used for VFIO-mdev callbacks.
    Confirmed affected versions: *