GPU Driver Internals

This page will detail the internals of various GPU drivers for use with I/O Virtualization.

i915

High Level Architecture

Initialization

During initialization of the i915 driver, the GuC binary blob is loaded into memory mapped through the Graphics Translation Table (GTT). This allows the GuC to read the GTT-mapped binary blob from shared framebuffer memory so that it may boot.
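
As a rough illustration of this flow, the sketch below models a firmware blob being copied into memory reachable through the GTT and the chosen GTT offset being handed to the microcontroller. The aperture, offsets, and register names are invented for illustration and are not real i915 symbols.

  /* Minimal model of staging a firmware blob behind the GTT (hypothetical
   * names, not real i915 code): "aperture" stands in for GTT-mapped memory
   * and "boot_vector_reg" for the location the GuC reads at boot. */
  #include <stdint.h>
  #include <stdio.h>
  #include <string.h>

  #define APERTURE_SIZE (1 << 20)

  static uint8_t  aperture[APERTURE_SIZE]; /* memory reachable through the GTT */
  static uint32_t boot_vector_reg;         /* GTT offset the microcontroller reads */

  static int stage_firmware(const uint8_t *blob, size_t len, uint32_t gtt_offset)
  {
      if (gtt_offset + len > APERTURE_SIZE)
          return -1;
      memcpy(&aperture[gtt_offset], blob, len); /* copy the blob behind the GTT */
      boot_vector_reg = gtt_offset;             /* tell the GuC where to fetch it */
      return 0;
  }

  int main(void)
  {
      const uint8_t fake_guc_fw[] = { 0xde, 0xad, 0xbe, 0xef };

      if (stage_firmware(fake_guc_fw, sizeof(fake_guc_fw), 0x2000) == 0)
          printf("GuC boot vector set to GTT offset 0x%x\n",
                 (unsigned)boot_vector_reg);
      return 0;
  }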

Scheduling

Command Submission

Execlist

Execlist executes commands synchronously on the device without an intermediate microcontroller. Some driver developers prefer this method of executing commands because of its stability and transparency under current i915 development.
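
As a sketch of what executing without an intermediate microcontroller means, the toy below has the CPU (the driver) write context descriptors straight into a small set of hardware submit ports. The port count, names, and descriptor values are illustrative only.

  /* Toy model of execlist-style submission: the CPU writes context
   * descriptors directly to hardware submit ports, with no firmware
   * scheduler in between. Names and port count are illustrative only. */
  #include <stdint.h>
  #include <stdio.h>

  #define NUM_SUBMIT_PORTS 2

  static uint64_t submit_ports[NUM_SUBMIT_PORTS]; /* stand-in for ELSP-style ports */

  static void execlist_submit(const uint64_t *ctx_desc, int count)
  {
      /* The driver itself decides what runs next and writes it to the ports. */
      for (int i = 0; i < NUM_SUBMIT_PORTS; i++)
          submit_ports[i] = (i < count) ? ctx_desc[i] : 0;
  }

  int main(void)
  {
      uint64_t contexts[] = { 0x100, 0x200 }; /* fake context descriptors */

      execlist_submit(contexts, 2);
      for (int i = 0; i < NUM_SUBMIT_PORTS; i++)
          printf("port %d: 0x%llx\n", i, (unsigned long long)submit_ports[i]);
      return 0;
  }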

GuC

GuC provides an execution pathway through an intermediate microcontroller, which supplies a scheduling abstraction for Intel's preferred internal scheduling model.

In-VM Scheduling

vExeclist

The vExeclist is a method to submit commands directly to the GPU without the use of an intermediate microcontroller.

vGuC

vGuC is a command submission interface used to submit commands to the Intel Graphics Microcontroller (GuC).

Between-VM Scheduling

Memory Management

i915 Clients

Processes which make use of the Intel i915 driver receive an i915 Client ID.
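
One place this client ID becomes visible is the kernel's DRM fdinfo interface: opening a render node creates a DRM client, and the fdinfo entry for that file descriptor reports a drm-client-id line. The snippet below reads it; the render node path is an assumption and may differ per system.

  /* Read the drm-client-id reported for an open i915 render node via the
   * DRM fdinfo interface. The device path may differ on other systems. */
  #include <fcntl.h>
  #include <stdio.h>
  #include <string.h>
  #include <unistd.h>

  int main(void)
  {
      char path[64], line[256];
      int fd = open("/dev/dri/renderD128", O_RDWR); /* assumed render node */
      FILE *fdinfo;

      if (fd < 0) {
          perror("open");
          return 1;
      }

      snprintf(path, sizeof(path), "/proc/self/fdinfo/%d", fd);
      fdinfo = fopen(path, "r");
      if (!fdinfo) {
          perror("fopen");
          close(fd);
          return 1;
      }

      while (fgets(line, sizeof(line), fdinfo))
          if (strncmp(line, "drm-client-id:", 14) == 0)
              fputs(line, stdout); /* e.g. "drm-client-id: 42" */

      fclose(fdinfo);
      close(fd);
      return 0;
  }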

Translation Tables
GTT (Graphics Translation Table)

On-device GPU memory is mapped through a GTT, or Graphics Translation Table. This table stores mappings globally for all graphics processes within the system. Some processes, such as DRI clients, access the GTT directly, while others receive a Per Process Graphics Translation Table (PPGTT) based on their i915 Client ID.

GGTT (Global Graphics Translation Table)
PPGTT (Per Process Graphics Translation Table)

Process-specific memory buffers are stored inside a Per Process Graphics Translation Table, or PPGTT. This is a GPU-MMU-translated subregion (IOVA range) of global GPU memory specific to a GPU process's client ID.
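
To make the GGTT/PPGTT distinction concrete, the toy below keeps one global table plus a table per client ID; lookups for a client go through its own table, so the same GPU virtual page can resolve to different backing addresses for different clients. The structures and sizes are invented for illustration.

  /* Toy contrast between a single global translation table (GGTT) and
   * per-client tables (PPGTT) keyed by an i915-style client ID.
   * Structures and sizes are invented purely for illustration. */
  #include <stdint.h>
  #include <stdio.h>

  #define NUM_PAGES   16
  #define NUM_CLIENTS 4

  static uint64_t ggtt[NUM_PAGES];               /* one table shared by everyone */
  static uint64_t ppgtt[NUM_CLIENTS][NUM_PAGES]; /* one table per client ID */

  static uint64_t translate(int client_id, uint32_t gpu_page)
  {
      /* A "global" user (e.g. display) uses the shared table; others use
       * the table selected by their client ID. */
      if (client_id < 0)
          return ggtt[gpu_page % NUM_PAGES];
      return ppgtt[client_id % NUM_CLIENTS][gpu_page % NUM_PAGES];
  }

  int main(void)
  {
      ggtt[3] = 0xaaaa000;     /* shared mapping */
      ppgtt[0][3] = 0x1111000; /* client 0's view of the same GPU page */
      ppgtt[1][3] = 0x2222000; /* client 1's view, isolated from client 0 */

      printf("global   page 3 -> 0x%llx\n", (unsigned long long)translate(-1, 3));
      printf("client 0 page 3 -> 0x%llx\n", (unsigned long long)translate(0, 3));
      printf("client 1 page 3 -> 0x%llx\n", (unsigned long long)translate(1, 3));
      return 0;
  }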

Aliasing PPGTT
Real PPGTT

OpenRM

High Level Architecture

The Open Resource Manager driver (also known as OpenRM) refers to Nvidia's open-gpu-kernel-modules.

The Open Resource Manager (RM) uses a highly object-oriented design composed of multiple "engines" which act as micro-services for servicing driver requests.

Broadly speaking, the OpenRM driver consists of two parts:

  • The Platform RM (OpenRM)
  • The Firmware RM (GSP RM / RM Core)

The Platform RM is loaded into the Linux kernel as nvidia.ko. This module communicates with the GSP RM via Remote Procedure Calls (RPCs) in order to reach engines in the RM Core.
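
A rough sketch of that RPC shape is given below: the kernel side fills a small header naming the target function, appends a payload, and places both in a queue that the GSP-side RM consumes. The structure layout and field names are invented and are not the actual open-gpu-kernel-modules definitions.

  /* Hypothetical sketch of a Platform RM -> GSP RM remote procedure call:
   * a header naming the function plus a payload, pushed into a shared
   * message queue. Field names and layout are invented for illustration. */
  #include <stdint.h>
  #include <stdio.h>
  #include <string.h>

  struct rpc_header {
      uint32_t function; /* which RM Core engine/function to invoke */
      uint32_t length;   /* payload length in bytes */
      uint32_t sequence; /* matches replies to requests */
  };

  #define QUEUE_BYTES 4096
  static uint8_t cmd_queue[QUEUE_BYTES]; /* stand-in for memory shared with the GSP */
  static size_t  cmd_tail;

  static int rpc_send(uint32_t function, const void *payload, uint32_t len)
  {
      static uint32_t seq;
      struct rpc_header hdr = { function, len, seq++ };

      if (cmd_tail + sizeof(hdr) + len > QUEUE_BYTES)
          return -1; /* queue full */
      memcpy(&cmd_queue[cmd_tail], &hdr, sizeof(hdr));
      memcpy(&cmd_queue[cmd_tail + sizeof(hdr)], payload, len);
      cmd_tail += sizeof(hdr) + len;
      /* A real driver would now ring a doorbell so the GSP consumes the entry. */
      return 0;
  }

  int main(void)
  {
      uint32_t args[2] = { 0x1, 0x2 };

      if (rpc_send(0x42 /* hypothetical function id */, args, sizeof(args)) == 0)
          printf("queued %zu bytes for the GSP\n", cmd_tail);
      return 0;
  }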

Initialization

During bring-up of the hardware, several binary blobs are loaded from embedded Boot ROM memory to bootstrap the embedded controllers, at which point additional software is loaded from onboard SPI flash memory.

The software loaded from SPI flash is necessary for full initialization of the Falcon/NvRISC processor, and includes a cached version of the software necessary to run the GPU System Processor (GSP).

Once the platform has posted, it is ready to communicate with the host platform's RM driver. The OpenRM driver offloads a binary blob containing the RM Core to the GPU System Processor (GSP); this blob is likely to be a more recent version than the cached copy stored in on-board SPI flash.
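
The sketch below only illustrates the version preference described above: if the RM Core image carried by the driver is newer than the copy cached in SPI flash, the driver's image is the one offloaded to the GSP. The version encoding and selection logic are invented for illustration.

  /* Illustration of preferring the driver-supplied RM Core image over the
   * copy cached in SPI flash when the driver's copy is newer. The version
   * encoding and selection logic are invented for illustration. */
  #include <stdint.h>
  #include <stdio.h>

  struct rm_core_image {
      const char *source;
      uint32_t    version; /* e.g. packed major/minor/patch */
  };

  static const struct rm_core_image *
  select_rm_core(const struct rm_core_image *spi_cached,
                 const struct rm_core_image *driver_blob)
  {
      return driver_blob->version >= spi_cached->version ? driver_blob : spi_cached;
  }

  int main(void)
  {
      struct rm_core_image spi    = { "SPI flash cache", 0x00230101 };
      struct rm_core_image driver = { "nvidia.ko blob",  0x00230205 };
      const struct rm_core_image *chosen = select_rm_core(&spi, &driver);

      printf("offloading RM Core from: %s\n", chosen->source);
      return 0;
  }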

Scheduling

There are several abstractions for GPU virtual machine scheduling. Those are as follows:

  • Guest kernel (nvidia.ko)
  • Host kernel (nvidia.ko)
  • Host usermode (gpu-mgr / libnvidiavgpu.so)
  • Device Firmware (GSP)

Command Submission

Runlist

In-VM Scheduling

Virtual machines perform their own GPU scheduling within the Nvidia kernel module running in the guest OS.

Guest kernel (nvidia.ko)

Within a virtual machine running the Nvidia driver, messages to the GPU are first sent to the guest nvidia.ko kernel module.

The guest then determines whether a vRPC (virtual Remote Procedure Call) or a pRPC (physical Remote Procedure Call) should be sent. Both pRPCs and vRPCs are sent through the host RM driver (Resource Manager).

Note: Unclear on execution pathway for pRPCs vs vRPCs. pRPCs may go directly to device.
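
Given the note above, the following is only a speculative model of the guest-side decision: calls that need host-side emulation go out as vRPCs, while calls the physical device (or its GSP) can service go out as pRPCs. The classification and names are hypothetical.

  /* Speculative model of the guest-side choice between a vRPC (handled by
   * host-side virtualization code) and a pRPC (aimed at the physical
   * device/GSP). The classification and names are hypothetical. */
  #include <stdio.h>

  enum rpc_kind { RPC_VIRTUAL, RPC_PHYSICAL };

  static enum rpc_kind classify_call(int needs_host_emulation)
  {
      /* Calls touching emulated resources (e.g. virtual displays) would be
       * vRPCs; calls the device can service directly would be pRPCs. */
      return needs_host_emulation ? RPC_VIRTUAL : RPC_PHYSICAL;
  }

  int main(void)
  {
      printf("display config -> %s\n",
             classify_call(1) == RPC_VIRTUAL ? "vRPC" : "pRPC");
      printf("channel work   -> %s\n",
             classify_call(0) == RPC_VIRTUAL ? "vRPC" : "pRPC");
      return 0;
  }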

Between-VM Scheduling

In addition to scheduling which occurs within the virtual machine the Resource Manager driver also schedules messages to the GPU between GPU-accelerated virtual machines and host processes.

Host kernel (nvidia.ko)

Messages sent by the guest (via vRPC or pRPC) are received by the host nvidia.ko driver.

nvidia.ko contains:

  • A virtual GPU state machine which contains status information for the virtual GPU
  • A virtual GPU kernel scheduler which interacts with virtual GPU objects
  • An RM Call scheduler which schedules calls on an RM class
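
A compact way to picture the pieces listed above is as per-vGPU state: a state machine, a slot in the kernel scheduler, and pending work for the RM call scheduler. The struct below is a hypothetical summary, not the driver's actual layout.

  /* Hypothetical summary of the per-vGPU state described above; this is
   * not the actual nvidia.ko layout, just a way to picture the pieces. */
  #include <stdint.h>
  #include <stdio.h>

  enum vgpu_state { VGPU_STOPPED, VGPU_BOOTING, VGPU_RUNNING };

  struct vgpu {
      enum vgpu_state state;            /* virtual GPU state machine */
      uint32_t        sched_slot;       /* position in the vGPU kernel scheduler */
      uint32_t        pending_rm_calls; /* work for the RM Call scheduler */
  };

  int main(void)
  {
      struct vgpu vm0 = { VGPU_RUNNING, 0, 3 };

      printf("vGPU state=%d slot=%u pending RM calls=%u\n",
             vm0.state, (unsigned)vm0.sched_slot, (unsigned)vm0.pending_rm_calls);
      return 0;
  }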

The nvidia.ko kernel module exits to userspace to execute the gpu-mgr and the VMIOP (Virtual Machine Input Output Plugin).

Host usermode (gpu-mgr / libnvidiavgpu.so)

After exiting to userspace, work is handled by a daemon process (the gpu-mgr) and a library (libnvidiavgpu.so).

gpu-mgr

The gpu-mgr is a process which spawns virtual GPU stubs and populates them with capability information.

vmiop

VMIOP (Virtual Machine Input Output Plugin) handles presenting virtualized functionality to the guest.

The Virtual Machine Input Output Plugin software handles virtual displays, compute API offload, and, most importantly, BAR (Base Address Register) quirks.

The VMIOP is an SDK (Software Development Kit) provided in binary format, split between libnvidiavgpu.so and nvidia-vgpu-mgr, which provides userland helper functions for GPU virtualization.

Messages to the VMIOP are scheduled according to Linux kernel niceness (the kernel's process-priority abstraction).
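
As a sketch of what such a plugin interface could look like, the callback table below covers the responsibilities named above (virtual displays, API offload, BAR access). The names and signatures are invented and do not reflect the real binary SDK.

  /* Invented callback table for a VMIOP-style plugin covering the duties
   * described above; this does not reflect the real binary SDK's API. */
  #include <stdint.h>
  #include <stdio.h>

  struct vmiop_plugin_ops {
      int (*present_display)(uint32_t head);              /* virtual displays */
      int (*forward_api_call)(const void *buf, size_t n); /* compute API offload */
      int (*handle_bar_access)(uint64_t offset, int is_write, uint64_t *val);
  };

  static int demo_bar_access(uint64_t offset, int is_write, uint64_t *val)
  {
      /* Quirked BAR accesses are intercepted here instead of reaching the
       * physical device directly. */
      if (!is_write)
          *val = 0;
      printf("%s BAR offset 0x%llx\n", is_write ? "write" : "read",
             (unsigned long long)offset);
      return 0;
  }

  int main(void)
  {
      struct vmiop_plugin_ops ops = { .handle_bar_access = demo_bar_access };
      uint64_t v;

      ops.handle_bar_access(0x88000, 0, &v);
      return 0;
  }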

Device Firmware (GSP)

Messages received by the host RM driver (Resource Manager) are then scheduled by the RM Core contained within the GSP (GPU System Processor). The GSP handles the device's internal scheduling model.

Memory Management

RM Clients

Processes which make use of the RM driver receive an RM Client ID.
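
One way to picture an RM Client ID is as the root handle under which a process's other RM objects are allocated. The allocator below is a toy model of that idea; handle values and layout are invented.

  /* Toy model of RM clients: each process receives a client handle that
   * acts as the root under which its other RM objects are allocated.
   * Handles and layout here are invented for illustration. */
  #include <stdint.h>
  #include <stdio.h>

  #define MAX_OBJECTS 8

  struct rm_client {
      uint32_t h_client;             /* the RM Client ID */
      uint32_t objects[MAX_OBJECTS]; /* handles owned by this client */
      int      num_objects;
  };

  static uint32_t next_handle = 0xc1d00000; /* arbitrary handle space */

  static struct rm_client rm_alloc_client(void)
  {
      struct rm_client c = { .h_client = next_handle++ };
      return c;
  }

  static uint32_t rm_alloc_object(struct rm_client *c)
  {
      uint32_t h = next_handle++;

      if (c->num_objects < MAX_OBJECTS)
          c->objects[c->num_objects++] = h; /* scoped under this client */
      return h;
  }

  int main(void)
  {
      struct rm_client proc = rm_alloc_client(); /* one per RM-using process */
      uint32_t h_device = rm_alloc_object(&proc);

      printf("hClient=0x%x hDevice=0x%x\n",
             (unsigned)proc.h_client, (unsigned)h_device);
      return 0;
  }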

Translation Tables
vmiop_gva

amdgpu
