x86 virtualization is the use of
hardware-assisted virtualization capabilities on an
x86/
x86-64
x86-64 (also known as x64, x86_64, AMD64, and Intel 64) is a 64-bit extension of the x86 instruction set architecture, instruction set. It was announced in 1999 and first available in the AMD Opteron family in 2003. It introduces two new ope ...
CPU.
In the late 1990s x86 virtualization was achieved by complex
software
Software consists of computer programs that instruct the Execution (computing), execution of a computer. Software also includes design documents and specifications.
The history of software is closely tied to the development of digital comput ...
techniques, necessary to compensate for the processor's lack of hardware-assisted virtualization capabilities while attaining reasonable
performance
A performance is an act or process of staging or presenting a play, concert, or other form of entertainment. It is also defined as the action or process of carrying out or accomplishing an action, task, or function.
Performance has evolved glo ...
. In 2005 and 2006, both
Intel
Intel Corporation is an American multinational corporation and technology company headquartered in Santa Clara, California, and Delaware General Corporation Law, incorporated in Delaware. Intel designs, manufactures, and sells computer compo ...
(
VT-x) and
AMD (
AMD-V) introduced limited hardware virtualization support that allowed simpler virtualization software but offered very few speed benefits. Greater hardware support, which allowed substantial speed improvements, came with later processor models.
Software-based virtualization
The following discussion focuses only on virtualization of the x86 architecture
protected mode
In computing, protected mode, also called protected virtual address mode, is an operational mode of x86-compatible central processing units (CPUs). It allows system software to use features such as Memory_segmentation, segmentation, virtual mem ...
.
In protected mode the operating system kernel runs at a higher privilege such as
ring 0, and applications at a lower privilege such as ring 3. In software-based virtualization, a host OS has direct access to hardware while the guest OSs have limited access to hardware, just like any other application of the host OS. One approach used in x86 software-based virtualization to overcome this limitation is called ''ring deprivileging'', which involves running the guest OS at a ring higher (lesser privileged) than 0.
Three techniques made virtualization of protected mode possible:
*
Binary translation is used to rewrite certain ring 0 instructions in terms of ring 3 instructions, such as
POPF, that would otherwise fail silently or behave differently when executed above ring 0,
[ making the classic trap-and-emulate virtualization impossible.] To improve performance, the translated basic block
In compiler construction, a basic block is a straight-line code sequence with no branches in except to the entry and no branches out except at the exit. This restricted form makes a basic block highly amenable to analysis. Compilers usually decom ...
s need to be cached in a coherent way that detects code patching (used in VxDs for instance), the reuse of pages by the guest OS, or even self-modifying code
In computer science, self-modifying code (SMC or SMoC) is source code, code that alters its own instruction (computer science), instructions while it is execution (computing), executing – usually to reduce the instruction path length and imp ...
.
* A number of key data structures used by a processor need to be shadowed. Because most operating systems use paged virtual memory, and granting the guest OS direct access to the MMU would mean loss of control by the virtualization manager, some of the work of the x86 MMU needs to be duplicated in software for the guest OS using a technique known as ''shadow page tables''.[ This involves denying the guest OS any access to the actual page table entries by trapping access attempts and emulating them instead in software. The x86 architecture uses hidden state to store segment descriptors in the processor, so once the segment descriptors have been loaded into the processor, the memory from which they have been loaded may be overwritten and there is no way to get the descriptors back from the processor. ''Shadow descriptor tables'' must therefore be used to track changes made to the descriptor tables by the guest OS.][
* I/O device emulation: Unsupported devices on the guest OS must be emulated by a device emulator that runs in the host OS.][
These techniques incur some performance overhead due to lack of MMU virtualization support, as compared to a VM running on a natively virtualizable architecture such as the ]IBM System/370
The IBM System/370 (S/370) is a range of IBM mainframe computers announced as the successors to the IBM System/360, System/360 family on June 30, 1970. The series mostly maintains backward compatibility with the S/360, allowing an easy migrati ...
.
On traditional mainframes, the classic type 1 hypervisor was self-standing and did not depend on any operating system or run any user applications itself. In contrast, the first x86 virtualization products were aimed at workstation computers, and ran a guest OS inside a host OS by embedding the hypervisor in a kernel module that ran under the host OS (type 2 hypervisor).
There has been some controversy whether the x86 architecture with no hardware assistance is virtualizable as described by Popek and Goldberg. VMware researchers pointed out in a 2006 ASPLOS paper that the above techniques made the x86 platform virtualizable in the sense of meeting the three criteria of Popek and Goldberg, albeit not by the classic trap-and-emulate technique.[
A different route was taken by other systems like ]Denali
Denali (), federally designated as Mount McKinley, is the highest mountain peak in North America, with a summit elevation of above sea level. It is the tallest mountain in the world from base to peak on land, measuring . On p. 20 of Helm ...
, L4, and Xen, known as paravirtualization, which involves porting operating systems to run on the resulting virtual machine, which does not implement the parts of the actual x86 instruction set that are hard to virtualize. The paravirtualized I/O has significant performance benefits as demonstrated in the original SOSP'03 Xen paper.
The initial version of x86-64
x86-64 (also known as x64, x86_64, AMD64, and Intel 64) is a 64-bit extension of the x86 instruction set architecture, instruction set. It was announced in 1999 and first available in the AMD Opteron family in 2003. It introduces two new ope ...
( AMD64) did not allow for a software-only full virtualization due to the lack of segmentation support in long mode, which made the protection of the hypervisor's memory impossible, in particular, the protection of the trap handler that runs in the guest kernel address space. Revision D and later 64-bit AMD processors (as a rule of thumb, those manufactured in 90 nm or less) added basic support for segmentation in long mode, making it possible to run 64-bit guests in 64-bit hosts via binary translation. Intel did not add segmentation support to its x86-64 implementation ( Intel 64), making 64-bit software-only virtualization impossible on Intel CPUs, but Intel VT-x support makes 64-bit hardware assisted virtualization possible on the Intel platform.
On some platforms, it is possible to run a 64-bit guest on a 32-bit host OS if the underlying processor is 64-bit and supports the necessary virtualization extensions.
Hardware-assisted virtualization
In 2005 and 2006, Intel
Intel Corporation is an American multinational corporation and technology company headquartered in Santa Clara, California, and Delaware General Corporation Law, incorporated in Delaware. Intel designs, manufactures, and sells computer compo ...
and AMD (working independently) created new processor extensions to the x86 architecture. The first generation of x86 hardware virtualization addressed the issue of privileged instructions. The issue of low performance of virtualized system memory was addressed with MMU virtualization that was added to the chipset later.
Central processing unit
Virtual 8086 mode
Because the Intel 80286
The Intel 80286 (also marketed as the iAPX 286 and often called Intel 286) is a 16-bit microprocessor that was introduced on February 1, 1982. It was the first 8086-based CPU with separate, non- multiplexed address and data buses and also the f ...
could not run concurrent DOS applications well by itself in protected mode, Intel introduced the virtual 8086 mode in their 80386 chip, which offered virtualized 8086 processors on the 386 and later chips. Hardware support for virtualizing the protected mode itself, however, became available 20 years later.
AMD virtualization (AMD-V)
AMD developed its first generation virtualization extensions under the code name "Pacifica", and initially published them as AMD Secure Virtual Machine (SVM), but later marketed them under the trademark ''AMD Virtualization'', abbreviated ''AMD-V''.
On May 23, 2006, AMD released the Athlon 64 ( "Orleans"), the Athlon 64 X2 ( "Windsor") and the Athlon 64 FX ( "Windsor") as the first AMD processors to support this technology.
AMD-V capability also features on the Athlon 64 and Athlon 64 X2 family of processors with revisions "F" or "G" on socket AM2, Turion 64 X2
AMD Turion is the brand name AMD applies to its x86-64 low-power consumption mobile processors codenamed ''K8L''. The Turion 64 and Turion 64 X2/Ultra processors compete with Intel Corporation, Intel's mobile processors, initially the ''Pentium ...
, and Opteron 2nd generation and third-generation, Phenom and Phenom II
Phenom II is a family of AMD's multi-core 45 nm central processing unit, processors using the AMD K10 microarchitecture, succeeding the original AMD Phenom, Phenom. Advanced Micro Devices released the Socket AM2+ version of Phenom II in Dece ...
processors. The APU Fusion processors support AMD-V. AMD-V is not supported by any Socket 939 processors. The only Sempron
Sempron has been the marketing name used by AMD for several different budget desktop CPUs, using several different technologies and CPU socket formats. The Sempron replaced the AMD Duron processor and competed against Intel's Celeron#Celeron D (Pr ...
processors which support it are APUs and Huron, Regor, Sargas desktop CPUs.
AMD Opteron CPUs beginning with the Family 0x10 Barcelona line, and Phenom II CPUs, support a second generation hardware virtualization technology called Rapid Virtualization Indexing (formerly known as Nested Page Tables during its development), later adopted by Intel as Extended Page Tables (EPT).
As of 2019, all Zen
Zen (; from Chinese: ''Chán''; in Korean: ''Sŏn'', and Vietnamese: ''Thiền'') is a Mahayana Buddhist tradition that developed in China during the Tang dynasty by blending Indian Mahayana Buddhism, particularly Yogacara and Madhyamaka phil ...
-based AMD processors support AMD-V.
The CPU flag for AMD-V is "svm". This may be checked in BSD derivatives via dmesg or sysctl and in Linux
Linux ( ) is a family of open source Unix-like operating systems based on the Linux kernel, an kernel (operating system), operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically package manager, pac ...
via /proc/ cpuinfo
.[ Instructions in AMD-V include VMRUN, VMLOAD, VMSAVE, CLGI, VMMCALL, INVLPGA, SKINIT, and STGI.
With some ]motherboard
A motherboard, also called a mainboard, a system board, a logic board, and informally a mobo (see #Nomenclature, "Nomenclature" section), is the main printed circuit board (PCB) in general-purpose computers and other expandable systems. It ho ...
s, users must enable AMD SVM feature in the BIOS setup before applications can make use of it.
Intel virtualization (VT-x)
Previously codenamed "Vanderpool", VT-x represents Intel's technology for virtualization on the x86 platform. On November 14, 2005, Intel released two models of Pentium 4
Pentium 4 is a series of single-core central processing unit, CPUs for Desktop computer, desktops, laptops and entry-level Server (computing), servers manufactured by Intel. The processors were shipped from November 20, 2000 until August 8, 20 ...
(Model 662 and 672) as the first Intel processors to support VT-x. The CPU flag for VT-x capability is "vmx"; in Linux, this can be checked via /proc/cpuinfo
, or in macOS
macOS, previously OS X and originally Mac OS X, is a Unix, Unix-based operating system developed and marketed by Apple Inc., Apple since 2001. It is the current operating system for Apple's Mac (computer), Mac computers. With ...
via sysctl machdep.cpu.features
.[To see if your processor supports hardware virtualization]
Intel 2012.
"VMX" stands for Virtual Machine Extensions, which adds 13 new instructions: VMPTRLD, VMPTRST, VMCLEAR, VMREAD, VMWRITE, VMCALL, VMLAUNCH, VMRESUME, VMXOFF, VMXON, INVEPT, INVVPID, and VMFUNC. These instructions permit entering and exiting a virtual execution mode where the guest OS perceives itself as running with full privilege (ring 0), but the host OS remains protected.
, almost all newer server, desktop and mobile Intel processors support VT-x, with some of the Intel Atom processors as the primary exception. With some motherboard
A motherboard, also called a mainboard, a system board, a logic board, and informally a mobo (see #Nomenclature, "Nomenclature" section), is the main printed circuit board (PCB) in general-purpose computers and other expandable systems. It ho ...
s, users must enable Intel's VT-x feature in the BIOS setup before applications can make use of it.
Intel started to include Extended Page Tables (EPT), a technology for page-table virtualization, since the Nehalem architecture, released in 2008. In 2010, Westmere added support for launching the logical processor directly in real mode a feature called "unrestricted guest", which requires EPT to work.
Since the Haswell microarchitecture (announced in 2013), Intel started to include ''VMCS shadowing'' as a technology that accelerates nested virtualization of VMMs.
The ''virtual machine control structure'' (VMCS) is a data structure
In computer science, a data structure is a data organization and storage format that is usually chosen for Efficiency, efficient Data access, access to data. More precisely, a data structure is a collection of data values, the relationships amo ...
in memory that exists exactly once per VM, while it is managed by the VMM. With every change of the execution context between different VMs, the VMCS is restored for the current VM, defining the state of the VM's virtual processor. As soon as more than one VMM or nested VMMs are used, a problem appears in a way similar to what required shadow page table management to be invented, as described above. In such cases, VMCS needs to be shadowed multiple times (in case of nesting) and partially implemented in software in case there is no hardware support by the processor. To make shadow VMCS handling more efficient, Intel implemented hardware support for VMCS shadowing.
VIA virtualization (VIA VT)
VIA Nano 3000 Series Processors and higher support VIA VT virtualization technology compatible with Intel VT-x. EPT is present in Zhaoxin ZX-C, a descendant of VIA QuadCore-E & Eden X4 similar to Nano C4350AL.
Interrupt virtualization (AMD AVIC and Intel APICv)
In 2012, AMD announced their ''Advanced Virtual Interrupt Controller'' (''AVIC'') targeting interrupt overhead reduction in virtualization environments. This technology, as announced, does not support x2APIC.
In 2016, AVIC is available on the AMD family 15h models 6Xh
(Carrizo) processors and newer.
Also in 2012, Intel announced a similar technology for interrupt and APIC virtualization, which did not have a brand name at its announcement time.
Later, it was branded as ''APIC virtualization'' (''APICv'')
and it became commercially available in the Ivy Bridge EP series of Intel CPUs, which is sold as Xeon E5-26xx v2 (launched in late 2013) and as Xeon E5-46xx v2 (launched in early 2014).
Graphics processing unit
Graphics virtualization is not part of the x86 architecture. Intel Graphics Virtualization Technology (GVT) provides graphics virtualization as part of more recent Gen graphics architectures. Although AMD APUs implement the x86-64
x86-64 (also known as x64, x86_64, AMD64, and Intel 64) is a 64-bit extension of the x86 instruction set architecture, instruction set. It was announced in 1999 and first available in the AMD Opteron family in 2003. It introduces two new ope ...
instruction set, they implement AMD's own graphics architectures ( TeraScale, GCN and RDNA) which do not support graphics virtualization. Larrabee was the only graphics microarchitecture based on x86, but it likely did not include support for graphics virtualization.
Chipset
Memory and I/O virtualization is performed by the chipset
In a computer system, a chipset is a set of electronic components on one or more integrated circuits that manages the data flow between the processor, memory and peripherals. The chipset is usually found on the motherboard of computers. Chips ...
. Typically these features must be enabled by the BIOS, which must be able to support them and also be set to use them.
I/O MMU virtualization (AMD-Vi and Intel VT-d)
An input/output memory management unit (IOMMU) allows guest virtual machine
In computing, a virtual machine (VM) is the virtualization or emulator, emulation of a computer system. Virtual machines are based on computer architectures and provide the functionality of a physical computer. Their implementations may involve ...
s to directly use peripheral
A peripheral device, or simply peripheral, is an auxiliary hardware device that a computer uses to transfer information externally. A peripheral is a hardware component that is accessible to and controlled by a computer but is not a core compo ...
devices, such as Ethernet, accelerated graphics cards, and hard-drive controllers, through DMA and interrupt
In digital computers, an interrupt (sometimes referred to as a trap) is a request for the processor to ''interrupt'' currently executing code (when permitted), so that the event can be processed in a timely manner. If the request is accepted ...
remapping. This is sometimes called ''PCI passthrough''.
An IOMMU also allows operating systems to eliminate bounce buffers needed to allow themselves to communicate with peripheral devices whose memory address spaces are smaller than the operating system's memory address space, by using memory address translation. At the same time, an IOMMU also allows operating systems and hypervisors to prevent buggy or malicious hardware from compromising memory security.
Both AMD and Intel have released their IOMMU specifications:
* AMD's I/O Virtualization Technology, "AMD-Vi", originally called "IOMMU"
* Intel's "Virtualization Technology for Directed I/O" (VT-d), included in most high-end (but not all) newer Intel processors since the Core 2 architecture.
In addition to the CPU support, both motherboard
A motherboard, also called a mainboard, a system board, a logic board, and informally a mobo (see #Nomenclature, "Nomenclature" section), is the main printed circuit board (PCB) in general-purpose computers and other expandable systems. It ho ...
chipset
In a computer system, a chipset is a set of electronic components on one or more integrated circuits that manages the data flow between the processor, memory and peripherals. The chipset is usually found on the motherboard of computers. Chips ...
and system firmware ( BIOS or UEFI) need to fully support the IOMMU I/O virtualization functionality for it to be usable. Only the PCI or PCI Express
PCI Express (Peripheral Component Interconnect Express), officially abbreviated as PCIe, is a high-speed standard used to connect hardware components inside computers. It is designed to replace older expansion bus standards such as Peripher ...
devices supporting ''function level reset'' (FLR) can be virtualized this way, as it is required for reassigning various device functions between virtual machines. If a device to be assigned does not support Message Signaled Interrupts (MSI), it must not share interrupt
In digital computers, an interrupt (sometimes referred to as a trap) is a request for the processor to ''interrupt'' currently executing code (when permitted), so that the event can be processed in a timely manner. If the request is accepted ...
lines with other devices for the assignment to be possible.
All conventional PCI devices routed behind a PCI/ PCI-X-to-PCI Express bridge can be assigned to a guest virtual machine only all at once; PCI Express devices have no such restriction.
Network virtualization (VT-c)
* Intel's "Virtualization Technology for Connectivity" (VT-c).
= PCI-SIG Single Root I/O Virtualization (SR-IOV)
=
''PCI-SIG Single Root I/O Virtualization (SR-IOV)'' provides a set of general (non-x86 specific) I/O virtualization methods based on PCI Express
PCI Express (Peripheral Component Interconnect Express), officially abbreviated as PCIe, is a high-speed standard used to connect hardware components inside computers. It is designed to replace older expansion bus standards such as Peripher ...
(PCIe) native hardware, as standardized by PCI-SIG:
* ''Address translation services (ATS)'' supports native IOV across PCI Express via address translation. It requires support for new transactions to configure such translations.
* '' Single-root IOV (SR-IOV or SRIOV)'' supports native IOV in existing single-root complex PCI Express topologies. It requires support for new device capabilities to configure multiple virtualized configuration spaces.
* ''Multi-root IOV (MR-IOV)'' supports native IOV in new topologies (for example, blade servers) by building on SR-IOV to provide multiple root complexes which share a common PCI Express hierarchy.
In SR-IOV, the most common of these, a host VMM configures supported devices to create and allocate virtual "shadows" of their configuration spaces so that virtual machine guests can directly configure and access such "shadow" device resources. With SR-IOV enabled, virtualized network interfaces are directly accessible to the guests,
avoiding involvement of the VMM and resulting in high overall performance; for example, SR-IOV achieves over 95% of the bare metal
In information technology, bare machine (or bare-metal computer) is a computer which has no operating system. The software executed by a bare machine, commonly called a "bare metal program" or "bare metal application", is designed to interact dir ...
network bandwidth in NASA
The National Aeronautics and Space Administration (NASA ) is an independent agencies of the United States government, independent agency of the federal government of the United States, US federal government responsible for the United States ...
's virtualized datacenter and in the Amazon Public Cloud.
See also
* Comparison of application virtualization software
* Comparison of platform virtualization software
* Hardware-assisted virtualization
* Hypervisor
* I/O virtualization
* Network virtualization
* OS-level virtualization
* Timeline of virtualization development
* Virtual machine
In computing, a virtual machine (VM) is the virtualization or emulator, emulation of a computer system. Virtual machines are based on computer architectures and provide the functionality of a physical computer. Their implementations may involve ...
* List of IOMMU-supporting hardware
* Second Level Address Translation (SLAT)
* Message Signaled Interrupts (MSI)
References
External links
Everything You Need to Know About the Intel Virtualization Technology
Archived a
ghostarchive.org
at 10 May 2022
A special course at the University of San Francisco on Intel EM64T and VT Extensions
(2007)
{{DEFAULTSORT:X86 Virtualization
X86 architecture
Hardware virtualization