An absence of critical technical documentation has historically slowed growth and adoption of developer ecosystems for GPU virtualization.
This CC-BY-4.0-licensed content may be reused with attribution, or used by GPU vendors as inspiration for new developer documentation intended for public commercial distribution.
Where possible, this documentation clearly labels the dates and versions of observed-but-not-guaranteed behaviour, distinguishing it from vendor-documented stable interfaces and behaviour that carry forward- or backward-compatibility guarantees.
This article provides a glossary of common terms that are related to, or often used in connection with, virtualization, for readers who need more context.
If you have suggestions on how to improve the glossary, you can contribute directly or join our community discussion thread.
Virtual Function I/O (VFIO)
Kernel.org defines Virtual Function I/O (VFIO) as "an IOMMU/device agnostic framework for exposing direct device access to userspace, in a secure, IOMMU protected environment."
VFIO Mediated Devices (mdev)
Kernel.org defines Virtual Function I/O (VFIO) mediated devices as: "an IOMMU/device-agnostic framework for exposing direct device access to user space in a secure, IOMMU-protected environment... mediated core driver provides a common interface for mediated device management that can be used by drivers of different devices."
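On Linux, a device is handed to VFIO by rebinding it to the vfio-pci driver through sysfs. The sketch below only composes the documented sysfs paths involved; the PCI address "0000:03:00.0" is a hypothetical placeholder, and actually writing to these files requires root on real hardware.

```python
from pathlib import Path

def vfio_sysfs_paths(pci_addr: str) -> dict:
    """Return the sysfs paths involved in rebinding a PCI device to vfio-pci.

    pci_addr is a full PCI address such as "0000:03:00.0" (hypothetical).
    """
    dev = Path("/sys/bus/pci/devices") / pci_addr
    return {
        # Symlink identifying the IOMMU group the device belongs to.
        "iommu_group": dev / "iommu_group",
        # Writing "vfio-pci" here pins the device to the vfio-pci driver.
        "driver_override": dev / "driver_override",
        # Writing the PCI address here asks the kernel to (re)probe drivers.
        "drivers_probe": Path("/sys/bus/pci/drivers_probe"),
    }

paths = vfio_sysfs_paths("0000:03:00.0")
print(paths["driver_override"])
```

The `driver_override` then `drivers_probe` sequence is the standard way to steer a single device to vfio-pci without affecting other devices using the same driver.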
Virtual CPU (vCPU)
A Virtual CPU (vCPU) is a virtualized CPU that represents a share of the total hardware resource, allocated via preemptive scheduling.
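The "percentage of the total hardware resource" follows from simple oversubscription arithmetic. The numbers below are illustrative, not drawn from any particular hypervisor:

```python
def vcpu_share(physical_threads: int, total_vcpus: int) -> float:
    """Average fraction of one hardware thread each vCPU receives
    when every vCPU is runnable (worst-case oversubscription)."""
    return min(1.0, physical_threads / total_vcpus)

# A host with 16 hardware threads running guests totalling 32 vCPUs:
print(vcpu_share(16, 32))  # 0.5 -> each busy vCPU gets ~50% of a thread
```

In practice schedulers also apply weights, caps, and pinning, so this is an upper-level average rather than a guarantee.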
Virtual GPU (vGPU)
A Virtual GPU (vGPU) is a GPU which has been virtualized to represent a percentage of the total hardware resource, via preemptive scheduling or pinning to execution units (EUs) or streaming multiprocessors (SMs), and/or a partition of the device's Video Random Access Memory (VRAM).
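The VRAM-partitioning half of that definition can be sketched as fixed, equal-size slicing. The card size, guest count, and host reservation below are hypothetical; real vGPU products define fixed profile sizes rather than computing them this way.

```python
def partition_vram(total_mib: int, guests: int, reserved_mib: int = 512) -> int:
    """MiB of VRAM available per guest after reserving some for the host/driver.

    Models the fixed, equal-size VRAM partitioning used by some vGPU schemes.
    """
    if guests < 1:
        raise ValueError("need at least one guest")
    return (total_mib - reserved_mib) // guests

# A hypothetical 16 GiB card split four ways with 512 MiB held back:
print(partition_vram(16384, 4))  # 3968 MiB per guest
```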
Inter-VM Shared Memory (ivshmem)
According to QEMU's GitHub repo: "The Inter-VM shared memory device (ivshmem) is designed to share a memory region between multiple QEMU processes running different guests and the host. In order for all guests to be able to pick up the shared memory area, it is modeled by QEMU as a PCI device exposing said memory to the guest as a PCI BAR."
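The BAR that ivshmem exposes is backed by a named shared-memory object on the host. As a rough stand-in for that mechanism (not QEMU's actual API), Python's `multiprocessing.shared_memory` shows two handles attaching to one named region; the name "demo-ivshmem" is arbitrary.

```python
from multiprocessing import shared_memory

# Create a 4 KiB named region; with real ivshmem, QEMU maps a host region
# like this into the guest as a PCI BAR.
region = shared_memory.SharedMemory(name="demo-ivshmem", create=True, size=4096)
region.buf[0:5] = b"hello"

# A peer (normally another process/guest) attaches by name and sees the data.
peer = shared_memory.SharedMemory(name="demo-ivshmem")
print(bytes(peer.buf[0:5]))  # b'hello'

peer.close()
region.close()
region.unlink()  # remove the backing object when done
```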
KVMFR (Looking Glass)
looking-glass.io defines Looking Glass as: "an open source application that allows the use of a KVM (Kernel-based Virtual Machine) configured for VGA PCI Pass-through without an attached physical monitor, keyboard or mouse." KVMFR (KVM FrameRelay) is the shared-memory interface, available as the kvmfr kernel module, that Looking Glass uses to move captured frames between guest and host.
Non-Uniform Memory Access (NUMA)
Infogalactic defines NUMA as: "Non-uniform memory access (NUMA) is a computer memory design used in multiprocessing, where the memory access time depends on the memory location relative to the processor. Under NUMA, a processor can access its own local memory faster than non-local memory (memory local to another processor or memory shared between processors)."
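NUMA topology matters when pinning a passed-through GPU's VM to the node closest to the device. A minimal sketch of discovering the nodes, assuming the Linux sysfs layout `/sys/devices/system/node/nodeN` and falling back to a single node when sysfs is unavailable:

```python
import os
import re

def numa_nodes() -> list:
    """List NUMA node IDs from Linux sysfs; assume one node if unavailable."""
    base = "/sys/devices/system/node"
    try:
        entries = os.listdir(base)
    except OSError:
        return [0]  # no sysfs (non-Linux, container): treat as one node
    nodes = sorted(int(m.group(1))
                   for e in entries
                   if (m := re.fullmatch(r"node(\d+)", e)))
    return nodes or [0]

print(numa_nodes())  # e.g. [0] on a single-socket machine
```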
Application Binary Interface (ABI)
Infogalactic defines an ABI as "the interface between two program modules, one of which is often a library or operating system, at the level of machine code." 
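That "level of machine code" can be made concrete by calling a C library from Python: the caller must match the library's ABI (argument and return types) exactly, or the values are marshalled wrong. This sketch assumes a glibc Linux host for the `libm.so.6` soname, with a fallback for libcs that fold the math functions into libc.

```python
import ctypes
import ctypes.util

# Locate the C math library ("m" resolves to libm on glibc systems).
try:
    libm = ctypes.CDLL(ctypes.util.find_library("m") or "libm.so.6")
except OSError:
    libm = ctypes.CDLL(None)  # e.g. musl folds libm into libc

# The ABI fixes how a double is passed and returned at the machine-code
# level; ctypes must be told the C types so it marshals them correctly.
libm.sqrt.argtypes = [ctypes.c_double]
libm.sqrt.restype = ctypes.c_double

print(libm.sqrt(9.0))  # 3.0
```

Omitting `restype` here would silently truncate the result to an int, which is exactly the class of mismatch an ABI defines.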
Single Root I/O Virtualization (SR-IOV)
docs.microsoft.com defines SR-IOV as follows: "The single root I/O virtualization (SR-IOV) interface is an extension to the PCI Express (PCIe) specification. SR-IOV allows a device, such as a network adapter, to separate access to its resources among various PCIe hardware functions."
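On Linux, virtual functions (VFs) are created by writing a count to the physical function's `sriov_numvfs` sysfs attribute. The sketch below composes that path and guards the write; the PCI address is a hypothetical placeholder, and the write itself only succeeds as root on SR-IOV-capable hardware.

```python
from pathlib import Path

def sriov_numvfs_path(pf_addr: str) -> Path:
    """sysfs attribute that creates/destroys VFs on a physical function."""
    return Path("/sys/bus/pci/devices") / pf_addr / "sriov_numvfs"

def enable_vfs(pf_addr: str, count: int) -> None:
    """Write the VF count (root required; device must support SR-IOV)."""
    path = sriov_numvfs_path(pf_addr)
    if path.exists():
        path.write_text(str(count))
    else:
        print(f"{path} not present; SR-IOV unsupported or wrong address")

# Hypothetical physical function address:
print(sriov_numvfs_path("0000:02:00.0"))
```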
Intel GVT-g
wiki.archlinux.org defines Intel GVT-g as "a technology that provides mediated device passthrough for Intel GPUs (Broadwell and newer). It can be used to virtualize the GPU for multiple guest virtual machines, effectively providing near-native graphics performance in the virtual machine and still letting your host use the virtualized GPU normally."
Virgl
lwn.net defines Virgl as: "a way for guests running in a virtual machine (VM) to access the host GPU using OpenGL and other APIs... The virgl stack consists of an application running in the guest that sends OpenGL to the Mesa virgl driver, which uses the virtio-gpu driver in the guest kernel to communicate with QEMU on the host."