GPU Driver Internals
Revision as of 21:59, 9 December 2022
This page will detail the internals of various GPU drivers for use with I/O Virtualization.
i915
High Level Architecture
Initialization
During initialization of the i915 driver, the GuC binary blob is loaded into memory mapped through the Graphics Translation Table (GTT). The GuC can then read the GTT-mapped blob from shared framebuffer memory and boot from it.
Scheduling
In-VM Scheduling
vExeclist
The vExeclist is a method to submit commands directly to the GPU without the use of an intermediate microcontroller.
vGuC
vGuC is a command submission interface used to submit commands to the Intel Graphics Microcontroller (GuC).
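The difference between the two submission paths can be illustrated with a toy model. This is only a sketch of the idea described above, not the i915 API: `ToyGpuEngine`, `ExeclistSubmitter`, `GucSubmitter`, and `guc_tick` are hypothetical names invented for illustration.

```python
class ToyGpuEngine:
    """Stands in for a GPU engine; records each batch and who delivered it."""

    def __init__(self):
        self.executed = []

    def execute(self, batch, via):
        self.executed.append((batch, via))


class ExeclistSubmitter:
    """Execlist-style path (hypothetical): the driver hands work straight to the engine."""

    def __init__(self, engine):
        self.engine = engine

    def submit(self, batch):
        self.engine.execute(batch, via="driver")


class GucSubmitter:
    """GuC-style path (hypothetical): the driver queues work for a microcontroller."""

    def __init__(self, engine):
        self.engine = engine
        self.queue = []

    def submit(self, batch):
        # Nothing reaches the engine yet; the microcontroller owns dispatch.
        self.queue.append(batch)

    def guc_tick(self):
        # Models the microcontroller draining its queue in FIFO order.
        while self.queue:
            self.engine.execute(self.queue.pop(0), via="guc")


engine = ToyGpuEngine()
ExeclistSubmitter(engine).submit("batch-a")

guc = GucSubmitter(engine)
guc.submit("batch-b")
guc.guc_tick()
```

In this model the execlist path reaches the engine in one hop, while the GuC path adds an intermediate queue that the microcontroller drains on its own schedule.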
Between-VM Scheduling
Memory Management
i915 Clients
Processes which make use of the Intel i915 driver receive an i915 Client ID.
Translation Tables
GTT (Graphics Translation Table)
On-device GPU memory is mapped by a Graphics Translation Table (GTT). This table stores information globally for all graphics processes within the system. Some processes, such as DRI, access the GTT directly, while others receive a Per Process Graphics Translation Table (PPGTT) buffer based on their i915 Client ID.
GGTT (Global Graphics Translation Table)
PPGTT (Per Process Graphics Translation Table)
Process-specific memory buffers are stored inside a Per Process Graphics Translation Table (PPGTT). This is a GPU MMU translated subregion, or IOVA range, of global GPU memory specific to a GPU process's client ID.
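A minimal sketch of the distinction between one global table and per-client tables follows. It assumes 4 KiB pages and a single-level table for simplicity; real hardware page tables are more involved, and `TranslationTable`, `ToyGpu`, and `ppgtt_for` are hypothetical names, not i915 structures.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages, for illustration only


class TranslationTable:
    """Toy single-level table: GPU virtual page number -> physical page number."""

    def __init__(self):
        self.entries = {}

    def map_page(self, vpn, ppn):
        self.entries[vpn] = ppn

    def translate(self, addr):
        vpn, offset = divmod(addr, PAGE_SIZE)
        if vpn not in self.entries:
            raise LookupError(f"no mapping for GPU virtual address {addr:#x}")
        return self.entries[vpn] * PAGE_SIZE + offset


class ToyGpu:
    def __init__(self):
        self.ggtt = TranslationTable()  # one global table shared system-wide
        self.ppgtts = {}                # client ID -> private table

    def ppgtt_for(self, client_id):
        # Each client ID gets its own table, created on first use.
        return self.ppgtts.setdefault(client_id, TranslationTable())


gpu = ToyGpu()
gpu.ggtt.map_page(0, 100)
gpu.ppgtt_for(1).map_page(0, 200)
gpu.ppgtt_for(2).map_page(0, 300)
```

Here the same virtual address resolves to a different physical page depending on which client's table is consulted, which is the isolation property a per-process table provides.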
Aliasing PPGTT
Real PPGTT
OpenRM
High Level Architecture
The Open Resource Manager driver (also known as OpenRM) refers to Nvidia's open-gpu-kernel-modules.
Broadly speaking, the OpenRM driver consists of two parts:
- The Driver RM
- The Firmware RM
Initialization
During hardware bring-up, several binary blobs are loaded from embedded boot ROM to bootstrap the embedded controller. From that point, additional software is loaded from onboard SPI flash memory. This software is needed to fully initialize the Falcon/NvRISC processor, and it includes a cached version of the software that runs on the GPU System Processor (GSP). Once the platform has posted, it is ready to communicate with the host platform's RM driver. The OpenRM driver then offloads a binary blob containing the RM Core to the GSP; this blob is likely to be more recent than the cached version held in onboard SPI flash.
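The ordering of these stages can be sketched as a small state machine. This is an illustrative model only: the stage names, the `ToyBringUp` class, and the assumption that the host-supplied image replaces the cached one are not taken from the OpenRM source.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Hypothetical stage names for the bring-up sequence described above."""
    POWER_ON = 0
    BOOT_ROM = 1
    SPI_FLASH = 2
    POSTED = 3
    GSP_RM_LOADED = 4


class ToyBringUp:
    def __init__(self):
        self.stage = Stage.POWER_ON
        self.gsp_image = None

    def _advance(self, target):
        # Stages must be entered strictly in order.
        if target != self.stage + 1:
            raise RuntimeError(
                f"cannot go from {self.stage.name} to {target.name}")
        self.stage = target

    def run_boot_rom(self):
        self._advance(Stage.BOOT_ROM)

    def load_spi_flash(self, cached_gsp_image):
        self._advance(Stage.SPI_FLASH)
        self.gsp_image = cached_gsp_image  # cached copy from SPI flash

    def post(self):
        self._advance(Stage.POSTED)

    def host_offload_rm_core(self, host_image):
        self._advance(Stage.GSP_RM_LOADED)
        # Assumption: the host-supplied image supersedes the cached one.
        self.gsp_image = host_image


board = ToyBringUp()
board.run_boot_rom()
board.load_spi_flash("gsp-cached")
board.post()
board.host_offload_rm_core("gsp-from-host")
```

Calling a step out of order raises `RuntimeError`, which mirrors the dependency of each stage on the one before it.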
Scheduling
In-VM Scheduling
gpu-mgr
vmiop
Between-VM Scheduling
nvidia.ko
Memory Management
RM Clients
Processes which make use of the RM driver receive an RM Client ID.
Translation Tables
vmiop_gva
amdgpu
Citations (Talks and Reading Material)
- Intel Graphics Programmer's Reference Manuals (PRM)
- i915: Hardware Contexts (and some bits about batchbuffers)
- i915: The Global GTT Part 1
- i915: Aliasing PPGTT Part 2
- i915: True PPGTT Part 3
- i915: Future PPGTT Part 4 (Dynamic page table allocations, 64 bit address space, GPU "mirroring", and yeah, something about relocs too)
- i915: Security of the Intel Graphics Stack - Part 1 - Introduction
- i915: Security of the Intel Graphics Stack - Part 2 - FW <-> GuC
- i915: An Introduction to Intel GVT-g (with new architecture)