Instructions that have at some point been present as documented instructions in one or more x86 processors, but where the processor series containing the instructions has been discontinued or superseded, with no known plans to reintroduce the instructions.

Intel instructions

i386 instructions

The following instructions were introduced in the Intel 80386, but later discontinued:

Itanium instructions

These instructions are only present in the x86 operation mode of early Intel Itanium processors with hardware support for x86. This support was added in "Merced" and removed in "Montecito", where it was replaced with software emulation.

MPX instructions

These instructions were introduced in 6th generation Intel Core "Skylake" CPUs. The last CPU generation to support them was the 9th generation Core "Coffee Lake" CPUs. Intel MPX adds 4 new registers, BND0 to BND3, each of which contains a pair of addresses. MPX also defines a bounds table as a 2-level directory/table data structure in memory that contains sets of upper/lower bounds.

Hardware Lock Elision

The Hardware Lock Elision feature of Intel TSX is marked in the Intel SDM as removed from 2019 onwards (Intel SDM, volume 1, order no. 253665-083, March 2024, chapter 2.5). This feature took the form of two instruction prefixes, XACQUIRE and XRELEASE, that could be attached to memory atomics/stores to elide the memory locking that they represent.

VP2Intersect instructions

The VP2INTERSECT instructions (an AVX-512 subset) were introduced in Tiger Lake (11th generation mobile Core processors), but were never officially supported on any other Intel processors. They are now considered deprecated and are listed in the Intel SDM as removed from 2023 onwards. As of July 2024, the VP2INTERSECT instructions have been re-introduced on AMD Zen 5 processors.

Instructions specific to Xeon Phi processors

"Knights Corner" instructions

The first generation Xeon Phi processors, codenamed "Knights Corner" (KNC), supported a large number of instructions that are not seen in any later x86 processor. An instruction reference is available; the instructions/opcodes unique to KNC are the ones with VEX and MVEX prefixes (except for the KMOV, KNOT and KORTEST instructions, which are kept with the same opcodes and function in AVX-512, but with an added "W" appended to their instruction names). Most of these KNC-unique instructions are similar but not identical to instructions in AVX-512; later Xeon Phi processors replaced these instructions with AVX-512.

Early versions of AVX-512 avoided the instruction encodings used by KNC's MVEX prefix. However, with the introduction of Intel APX (Advanced Performance Extensions) in 2023, some of the old KNC MVEX instruction encodings have been reused for new APX encodings. For example, both KNC and APX can accept the same instruction encoding as valid, but assign different meanings to it:
* KNC: vector load with data conversion
* APX: vector load with one of the new APX extended GPRs used as scaled index

"Knights Landing" and "Knights Mill" instructions

Some of the AVX-512 instructions in "Knights Landing" and later models belong to the subsets "AVX512ER", "AVX512_4FMAPS", "AVX512PF" and "AVX512_4VNNIW", all of which are unique to the Xeon Phi series of processors. The ER and PF subsets were introduced in "Knights Landing"; the 4FMAPS and 4VNNIW instructions were later added in "Knights Mill".

The ER and 4FMAPS instructions are floating-point arithmetic instructions that all follow a given pattern where:
* EVEX.W is used to specify floating-point format (0=FP32, 1=FP64)
* The bottom opcode bit is used to select between packed and scalar operation (0: packed, 1: scalar)
* For a given operation, all the scalar/packed variants belong to the same AVX-512 subset.
* The instructions all support result masking by opmask registers. The AVX512ER instructions also all support broadcast of memory operands.
* The only supported vector width is 512 bits.

The AVX512PF instructions are a set of 16 prefetch instructions. These instructions all use VSIB encoding, where a memory addressing mode using the SIB byte is required, and where the index part of the SIB byte is taken to index into the AVX512 vector register file rather than the GPR register file. The selected AVX512 vector register is then interpreted as a vector of indexes, causing the standard x86 base+index+displacement address calculation to be performed for each vector lane, and one associated memory operation (a prefetch in the case of the AVX512PF instructions) to be performed for each active lane. The instruction encodings all follow a pattern where:
* EVEX.W is used to specify format of the prefetchable data (0: FP32, 1: FP64)
* The bottom bit of the opcode is used to indicate whether the AVX512 index register is considered a vector of sixteen signed 32-bit indexes (bit 0 not set) or eight signed 64-bit indexes (bit 0 set)
* The instructions all support operation masking by opmask registers.
* The only supported vector width is 512 bits.
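The per-lane address calculation described for VSIB operands can be sketched in Python as follows. This is only an illustrative model: the function name `vsib_lane_addresses`, its parameters, the explicit SIB scale factor and the 64-bit wraparound are assumptions made for the example, and no prefetch is actually issued.

```python
def vsib_lane_addresses(base, index_vector, scale, disp, opmask):
    """Return {lane: effective address} for the active lanes of a VSIB operand.

    base         -- value of the base general-purpose register
    index_vector -- per-lane signed indexes read from the vector register
    scale        -- SIB scale factor (1, 2, 4 or 8)
    disp         -- displacement
    opmask       -- integer bit mask; bit i set means lane i is active
    """
    if scale not in (1, 2, 4, 8):
        raise ValueError("SIB scale must be 1, 2, 4 or 8")
    addresses = {}
    for lane, index in enumerate(index_vector):
        if (opmask >> lane) & 1:
            # base + index*scale + displacement, wrapped to 64 bits
            # (the 64-bit wrap is an assumption of this sketch).
            addresses[lane] = (base + index * scale + disp) & 0xFFFFFFFFFFFFFFFF
    return addresses


# Four lanes for brevity; lane 2 is masked off and produces no address.
example = vsib_lane_addresses(
    base=0x1000, index_vector=[0, 4, -2, 7], scale=8, disp=16, opmask=0b1011
)
```

In this model an AVX512PF instruction would issue one prefetch per entry of the returned dictionary, with sixteen or eight lanes depending on the index width.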
The AVX512_4VNNIW instructions read a 128-bit data item from memory, containing 4 two-component vectors (each component being signed 16-bit). Then, for each of 4 consecutive AVX-512 registers, they will, for each 32-bit lane, interpret the lane as a two-component vector (signed 16-bit) and perform a dot-product with the corresponding two-component vector that was read from memory (the first two-component vector from memory is used for the first AVX-512 source register, and so on). These results are then accumulated into a destination vector register.

Xeon Phi processors (from Knights Landing onwards) also featured the PREFETCHWT1 m8 instruction (opcode 0F 0D /2, prefetch into L2 cache with intent to write). These were the only Intel CPUs to officially support this instruction, but it continues to be supported on some non-Intel processors (e.g. Zhaoxin YongFeng).
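The AVX512_4VNNIW accumulation described above can be sketched in Python. The helper names `s16_pair` and `vnni4_accumulate` are hypothetical, the lane count is left to the caller (real registers have sixteen 32-bit lanes), and overflow handling of the 32-bit accumulator is deliberately not modelled.

```python
def s16_pair(lane32):
    """Split a 32-bit lane into its (low, high) signed 16-bit components."""
    def to_signed(value):
        return value - 0x10000 if value & 0x8000 else value
    return to_signed(lane32 & 0xFFFF), to_signed((lane32 >> 16) & 0xFFFF)


def vnni4_accumulate(dest, sources, mem_pairs):
    """Accumulate four dot-products per lane.

    dest      -- list of per-lane accumulators
    sources   -- 4 consecutive source registers, each a list of 32-bit lanes
    mem_pairs -- 4 (low, high) signed 16-bit pairs from the 128-bit memory item
    """
    if len(sources) != 4 or len(mem_pairs) != 4:
        raise ValueError("expected 4 source registers and 4 memory pairs")
    result = list(dest)
    # The n-th pair from memory is used with the n-th source register.
    for register, (mem_lo, mem_hi) in zip(sources, mem_pairs):
        for lane, value in enumerate(register):
            lo, hi = s16_pair(value)
            result[lane] += lo * mem_lo + hi * mem_hi
    return result


# Two lanes for brevity.
sources = [[0x00020001, 0xFFFF0003], [0, 0], [0, 0], [0x00010001, 0x00010001]]
mem_pairs = [(10, 100), (1, 1), (1, 1), (-1, 2)]
accumulated = vnni4_accumulate([5, 0], sources, mem_pairs)
```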

AMD instructions

Am386 SMM instructions

A handful of instructions to support System Management Mode were introduced in the Am386SXLV and Am386DXLV processors. They were also present in the later Am486SXLV/DXLV and Elan SC300/310 processors. The SMM functionality of these processors was implemented using Intel ICE microcode without a valid license, resulting in a lawsuit that AMD lost in late 1994. As a result of this loss, the ICE microcode was removed from all later AMD CPUs, and the SMM instructions were removed with it. These SMM instructions were also present on the IBM 386SLC and its derivatives (albeit with the LOADALL-like SMM return opcode 0F 07 named ICERET), as well as on the UMC U5S processor.

3DNow! instructions

The 3DNow! instruction set extension was introduced in the AMD K6-2, mainly adding support for floating-point SIMD instructions using the MMX registers (two FP32 components in a 64-bit vector register). The instructions were mainly promoted by AMD, but were supported on some non-AMD CPUs as well. The processors supporting 3DNow! were:
* AMD K6-2, K6-III, and all processors based on the K7, K8 and K10 microarchitectures. (Later AMD microarchitectures such as Bulldozer, Bobcat and Zen do not support 3DNow!)
* IDT WinChip 2 and 3
* VIA Cyrix III (both "Joshua" and "Samuel" variants), and the "Samuel" and "Ezra" revisions of VIA C3. (Later VIA CPUs, from C3 "Nehemiah" onwards, dropped 3DNow! in favor of SSE.)
* National Semiconductor Geode GX2; AMD Geode GX and LX.

3DNow! also introduced a couple of prefetch instructions. These instructions, unlike the rest of 3DNow!, are not discontinued but continue to be supported on modern AMD CPUs. The PREFETCHW instruction is also supported on Intel CPUs starting with Pentium 4, albeit executed as NOP until Broadwell.

3DNow+ instructions added with Athlon and K6-2+

{| class="wikitable"
! Instruction !! Opcode !! Instruction description
|-
| PF2IW mm1,mm2/m64
| 0F 0F /r 1C
| Packed 32-bit floating-point to 16-bit signed integer conversion, with round-to-zero
|-
| PI2FW mm1,mm2/m64
| 0F 0F /r 0C
| Packed 16-bit signed integer to 32-bit floating-point conversion
|-
| PSWAPD mm1,mm2/m64
| {{nowrap|0F 0F /r BB}}{{efn|The PSWAPD instruction uses the same opcode as the older undocumented K6-2 PSWAPW instruction.}}
| Packed Swap Doubleword:
<pre>dst[31:0]  <- src[63:32]
dst[63:32] <- src[31:0]</pre>
|-
| PFNACC mm1,mm2/m64
| 0F 0F /r 8A
| Packed Floating-Point Negative Accumulate:
<pre>dst[31:0]  <- dst[31:0] - dst[63:32]
dst[63:32] <- src[31:0] - src[63:32]</pre>
|-
| {{nowrap|PFPNACC mm1,mm2/m64}}
| 0F 0F /r 8E
| Packed Floating-Point Positive-Negative Accumulate:
<pre>dst[31:0]  <- dst[31:0] - dst[63:32]
dst[63:32] <- src[31:0] + src[63:32]</pre>
|}

{{notelist}}
{{vpad}}
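The three lane-shuffling and accumulate operations in the table can be modelled with a short Python sketch. It is an assumption of this example that each 64-bit MMX register is represented as a two-element list `[low, high]`, where index 0 holds bits 31:0 and index 1 holds bits 63:32; Python floats are also wider than FP32, so rounding behaviour is not reproduced.

```python
def pswapd(src):
    """Packed Swap Doubleword: exchange the two 32-bit halves."""
    return [src[1], src[0]]


def pfnacc(dst, src):
    """Negative accumulate: low minus high, for dst and then for src."""
    return [dst[0] - dst[1], src[0] - src[1]]


def pfpnacc(dst, src):
    """Positive-negative accumulate: difference of dst, sum of src."""
    return [dst[0] - dst[1], src[0] + src[1]]


swapped = pswapd([1.0, 2.0])
negative = pfnacc([5.0, 3.0], [10.0, 4.0])
mixed = pfpnacc([5.0, 3.0], [10.0, 4.0])
```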

3DNow! instructions specific to Geode GX and LX

{| class="wikitable"
! Instruction !! Opcode !! Instruction description
|-
| PFRCPV mm1,mm2/m64 || 0F 0F /r 86 || Packed Floating-point Reciprocal Approximation
|-
| {{nowrap|PFRSQRTV mm1,mm2/m64}} || {{nowrap|0F 0F /r 87}} || Packed Floating-point Reciprocal Square Root Approximation
|}

{{vpad}}

  
SSE5 derived instructions

SSE5 was a proposed SSE extension by AMD, using a new "DREX" instruction encoding to add support for new 3-operand and 4-operand instructions to SSE. The bundle did not include the full set of Intel's SSE4 instructions, making it a competitor to SSE4 rather than a successor.

AMD chose not to implement SSE5 as originally proposed. It was instead reworked into FMA4 and XOP, which provided similar functionality but with a quite different instruction encoding, using the VEX prefix for the FMA4 instructions and the new VEX-like XOP prefix for most of the remaining instructions.

XOP instructions

Introduced with the Bulldozer processor core and removed again from Zen onward. XOP is a revision of most of the SSE5 instruction set.

The XOP instructions mostly make use of the XOP prefix, which is a 3-byte prefix with the following layout:
{| class="wikitable" style="text-align:center"
!
! scope="col" style="width: 2px; border-spacing:0; padding:0px" rowspan=3 |
! Byte 0
! scope="col" style="width: 2px; border-spacing:0; padding:0px" rowspan=3 |
! colspan=8 | Byte 1
! scope="col" style="width: 2px; border-spacing:0; padding:0px" rowspan=3 |
! colspan=8 | Byte 2
|-
! Bits
! 7:0
! 7 !! 6 !! 5 !! 4 !! 3 !! 2 !! 1 !! 0
! 7 !! 6 !! 5 !! 4 !! 3 !! 2 !! 1 !! 0
|-
! Usage
| 8Fh
| R̅ || X̅ || B̅ || colspan=5 | mmmmm
| W || colspan=4 | v̅v̅v̅v̅ || L || colspan=2 | pp
|}

where:
* Overlines indicate inverted bits.
* The R/X/B bits are argument extension bits similar to the RXB bits of the REX prefix.
* ''mmmmm'' is an opcode-map specifier. While capable of encoding values from 8 to 31 (values 0 to 7 map to ModR/M-encoded variants of the older POP instruction, making them unusable for XOP), only maps 8, 9 and 0Ah were ever used: map 8 for instructions that take an 8-bit immediate, map 9 for instructions that don't take an immediate, and map 0Ah for instructions that take a 32-bit immediate.
* W is used in a couple of different ways:
** For XOP vector instructions, W is used to swap the last two vector source arguments to the instruction. For instructions that allow W=1, encodings with W=0 allow the second-to-last vector argument to be a memory argument, while encodings with W=1 allow the last vector argument to be a memory argument. For instructions that don't allow their last two vector arguments to be swapped, W is required to be 0.
** For XOP-encoded integer-register instructions (the TBM and LWP instruction set extensions, see below), W is used for operand size. (0=32-bit, 1=64-bit)
* ''vvvv'' is an extra source register argument, normally the first non-r/m source argument for instructions with ≥3 register arguments.
* L is a vector length specifier. L=1 indicates 256-bit operation, L=0 indicates scalar or 128-bit operation.
* ''pp'' is an embedded prefix − nominally 0/1/2/3=none/66h/F2h/F3h, but only 0 was ever used with any of the instructions defined for the XOP prefix.
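The field layout above can be turned into a small decoder. The following Python sketch is illustrative only: the names `XopPrefix` and `decode_xop_prefix` are hypothetical, and the decoder checks just the constraints stated in the list (first byte 8Fh, map value of at least 8).

```python
from typing import NamedTuple


class XopPrefix(NamedTuple):
    r: int           # register extension bits, stored here un-inverted
    x: int
    b: int
    map_select: int  # the mmmmm field
    w: int
    vvvv: int        # extra source register, stored here un-inverted
    l: int           # 1 = 256-bit operation, 0 = scalar or 128-bit
    pp: int          # embedded prefix selector


def decode_xop_prefix(prefix: bytes) -> XopPrefix:
    if len(prefix) != 3 or prefix[0] != 0x8F:
        raise ValueError("not a 3-byte XOP prefix")
    byte1, byte2 = prefix[1], prefix[2]
    map_select = byte1 & 0x1F
    if map_select < 8:
        # Map values 0 to 7 decode as forms of POP rather than XOP.
        raise ValueError("map values below 8 are not XOP encodings")
    return XopPrefix(
        r=((byte1 >> 7) & 1) ^ 1,
        x=((byte1 >> 6) & 1) ^ 1,
        b=((byte1 >> 5) & 1) ^ 1,
        map_select=map_select,
        w=(byte2 >> 7) & 1,
        vvvv=((byte2 >> 3) & 0xF) ^ 0xF,
        l=(byte2 >> 2) & 1,
        pp=byte2 & 3,
    )


map9 = decode_xop_prefix(bytes([0x8F, 0xE9, 0x78]))
map8_swapped = decode_xop_prefix(bytes([0x8F, 0xE8, 0xF0]))
```

The inverted fields are flipped back during decoding so that the returned values can be used directly as register numbers.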
{{vpad}}
The XOP instructions encoded with the XOP prefix are as follows:

{{sticky header}}
{| class="wikitable sticky-header"
|-
! colspan=2 | Instruction description
! Instruction mnemonics
! Opcode
! W=1 swap allowed
! L=1 (256b) allowed
|-
! colspan=6 |
|-
| rowspan=4 | Extract fractional portion of floating-point value.
| Packed FP32
| VFRCZPS ymm1,ymm2/m256
| XOP.9 80 /r
| {{no}} || {{yes}}
|-
| Packed FP64
| VFRCZPD ymm1,ymm2/m256
| XOP.9 81 /r
| {{no}} || {{yes}}
|-
| Scalar FP32
| VFRCZSS xmm1,xmm2/m32
| XOP.9 82 /r
| {{no}} || {{no}}
|-
| Scalar FP64
| VFRCZSD xmm1,xmm2/m64
| XOP.9 83 /r
| {{no}} || {{no}}
|-
! colspan=6 |
|-
| colspan=2 | Vector per-bit-lane conditional move.

VPCMOV dst,src1,src2,src3 performs the equivalent of {{nowrap|dst <- (src1 AND src3) OR (src2 AND NOT(src3))}}
| {{nowrap|VPCMOV ymm1,ymm2,ymm3/m256,ymm4}}
| {{nowrap|XOP.8 A2 /r /is4}}
| {{yes}} || {{yes}}
|-
! colspan=6 |
|-
| rowspan=8 | Vector integer compare.

For each vector-register lane, compare src1 to src2, then set destination to all-1s if the comparison passes, all-0s if it fails. The imm8 argument specifies comparison function to perform:
* 0: LT (less-than)
* 1: LE (less-than-or-equal)
* 2: GT (greater-than)
* 3: GE (greater-than-or-equal)
* 4: EQ (equal)
* 5: NE (not-equal)
* 6: FALSE (always-false)
* 7: TRUE (always-true)
| Signed 8-bit lanes
| VPCOMB xmm1,xmm2,xmm3/m128,imm8{{efn|name=vpcom_alias|text=For each VPCOM* instruction, a series of alias mnemonics are available, one for each of the eight comparison functions encodable in the imm8 argument. These alias mnemonics specify the comparison to perform after the "VPCOM" part of the mnemonic. For example, {{nowrap|VPCOMEQB xmm1,xmm2,xmm3}} is an alias for {{nowrap|VPCOMB xmm1,xmm2,xmm3,4}}, and {{nowrap|VPCOMFALSEUQ xmm1,xmm2,m128}} is an alias for {{nowrap|VPCOMUQ xmm1,xmm2,m128,6}}.}}
| XOP.8 CC /r ib
| rowspan=8 {{no}}
| rowspan=8 {{no}}
|-
| Signed 16-bit lanes
| VPCOMW xmm1,xmm2,xmm3/m128,imm8{{efn|name=vpcom_alias}}
| XOP.8 CD /r ib
|-
| Signed 32-bit lanes
| VPCOMD xmm1,xmm2,xmm3/m128,imm8{{efn|name=vpcom_alias}}
| XOP.8 CE /r ib
|-
| Signed 64-bit lanes
| VPCOMQ xmm1,xmm2,xmm3/m128,imm8{{efn|name=vpcom_alias}}
| XOP.8 CF /r ib
|-
| Unsigned 8-bit lanes
| VPCOMUB xmm1,xmm2,xmm3/m128,imm8{{efn|name=vpcom_alias}}
| XOP.8 EC /r ib
|-
| Unsigned 16-bit lanes
| {{nowrap|VPCOMUW xmm1,xmm2,xmm3/m128,imm8}}{{efn|name=vpcom_alias}}
| XOP.8 ED /r ib
|-
| Unsigned 32-bit lanes
| VPCOMUD xmm1,xmm2,xmm3/m128,imm8{{efn|name=vpcom_alias}}
| XOP.8 EE /r ib
|-
| Unsigned 64-bit lanes
| VPCOMUQ xmm1,xmm2,xmm3/m128,imm8{{efn|name=vpcom_alias}}
| XOP.8 EF /r ib
|-
! colspan=6 |
|-
| rowspan=12 | Vector Integer Horizontal Add.

For each N-bit lane, split the lane into a series of M-bit lanes, add the M-bit lanes together, then store the result into the destination as an N-bit zero/sign-extended value.
| 2x8bit -> 16bit, signed
| VPHADDBW xmm1,xmm2/m128
| XOP.9 C1 /r
| rowspan=12 {{no}}
| rowspan=12 {{no}}
|-
| 4x8bit -> 32bit, signed
| VPHADDBD xmm1,xmm2/m128
| XOP.9 C2 /r
|-
| 8x8bit -> 64bit, signed
| VPHADDBQ xmm1,xmm2/m128
| XOP.9 C3 /r
|-
| 2x16bit -> 32bit, signed
| VPHADDWD xmm1,xmm2/m128
| XOP.9 C6 /r
|-
| 4x16bit -> 64bit, signed
| VPHADDWQ xmm1,xmm2/m128
| XOP.9 C7 /r
|-
| 2x32bit -> 64bit, signed
| VPHADDDQ xmm1,xmm2/m128
| XOP.9 CB /r
|-
| 2x8bit -> 16bit, unsigned
| VPHADDUBW xmm1,xmm2/m128
| XOP.9 D1 /r
|-
| 4x8bit -> 32bit, unsigned
| VPHADDUBD xmm1,xmm2/m128
| XOP.9 D2 /r
|-
| 8x8bit -> 64bit, unsigned
| VPHADDUBQ xmm1,xmm2/m128
| XOP.9 D3 /r
|-
| 2x16bit -> 32bit, unsigned
| VPHADDUWD xmm1,xmm2/m128
| XOP.9 D6 /r
|-
| 4x16bit -> 64bit, unsigned
| VPHADDUWQ xmm1,xmm2/m128
| XOP.9 D7 /r
|-
| {{nowrap|2x32bit -> 64bit, unsigned}}
| VPHADDUDQ xmm1,xmm2/m128
| XOP.9 DB /r
|-
! colspan=6 |
|-
| rowspan=3 | Vector Integer Horizontal Subtract.

For each N-bit lane, split the lane into two signed sub-lanes of N/2 bits each, then subtract the upper lane from the lower lane, then store the result as a signed N-bit result.
| 2x8bit -> 16bit
| VPHSUBBW xmm1,xmm2/m128
| XOP.9 E1 /r
| rowspan=3 {{no}}
| rowspan=3 {{no}}
|-
| 2x16bit -> 32bit
| VPHSUBWD xmm1,xmm2/m128
| XOP.9 E2 /r
|-
| 2x32bit -> 64bit
| VPHSUBDQ xmm1,xmm2/m128
| XOP.9 E3 /r
|-
! colspan=6 |
|-
| rowspan=10 | Vector Signed Integer Multiply-Add.

For each N-bit lane, perform dest <- src1*src2 + src3

For src1 and src2, the factors to multiply may be taken as signed values from the low half of each lane, high half of each lane or the lane in full (picked in the same way for src1 and src2); the addend and the result use the full lane.
| 16-bit, full-lane
| VPMACSWW xmm1,xmm2,xmm3/m128,xmm4
| XOP.8 95 /r /is4
| rowspan=10 {{no}}
| rowspan=10 {{no}}
|-
| 32-bit, low-half
| VPMACSWD xmm1,xmm2,xmm3/m128,xmm4
| XOP.8 96 /r /is4
|-
| 64-bit, low-half
| VPMACSDQL xmm1,xmm2,xmm3/m128,xmm4
| XOP.8 97 /r /is4
|-
| 32-bit, full-lane
| VPMACSDD xmm1,xmm2,xmm3/m128,xmm4
| XOP.8 9E /r /is4
|-
| 64-bit, high-half
| VPMACSDQH xmm1,xmm2,xmm3/m128,xmm4
| XOP.8 9F /r /is4
|-
| 16-bit, full-lane, saturating
| VPMACSSWW xmm1,xmm2,xmm3/m128,xmm4
| XOP.8 85 /r /is4
|-
| 32-bit, low-half, saturating
| VPMACSSWD xmm1,xmm2,xmm3/m128,xmm4
| XOP.8 86 /r /is4
|-
| 64-bit, low-half, saturating
| VPMACSSDQL xmm1,xmm2,xmm3/m128,xmm4
| XOP.8 87 /r /is4
|-
| 32-bit, full-lane, saturating
| VPMACSSDD xmm1,xmm2,xmm3/m128,xmm4
| XOP.8 8E /r /is4
|-
| {{nowrap|64-bit, high-half, saturating}}
| {{nowrap|VPMACSSDQH xmm1,xmm2,xmm3/m128,xmm4}}
| XOP.8 8F /r /is4
|-
! colspan=6 |
|-
| rowspan=2 | Packed multiply, add and accumulate signed word to signed doubleword.

For each 32-bit lane, treat src1 and src2 as 2-component vectors of signed 16-bit values, then compute their dot-product, then add src3 as a 32-bit value.
| with saturation
| {{nowrap|VPMADCSSWD xmm1,xmm2,xmm3/m128,xmm4}}
| XOP.8 A6 /r /is4
| rowspan=2 {{no}}
| rowspan=2 {{no}}
|-
| without saturation
| VPMADCSWD xmm1,xmm2,xmm3/m128,xmm4
| XOP.8 B6 /r /is4
|-
! colspan=6 |
|-
| colspan=2 | Packed Permute Bytes.

For VPPERM dst,src1,src2,src3, src2:src1 are considered a 32-element vector of bytes. For each byte-lane, the byte in src3 is used to index into this 32-byte vector and transform the element:
* bits 4:0 are used to pick one of the 32 bytes.
* bits 7:6 specify a transform to perform on the byte (0=keep, 1=bitreverse, 2=set-to-zero, 3=replicate-MSB)
* bit 5, if set, inverts the result after the transform.
| VPPERM xmm1,xmm2,xmm3/m128,xmm4
| XOP.8 A3 /r /is4
| {{yes}} || {{no}}
|-
! colspan=6 |
|-
| rowspan=8 | Packed left-rotate.

Rotation amount is given in the last source argument. It may be provided as an immediate or a vector register; in the latter case, the rotation amount is provided on a per-lane basis.
| rowspan=2 | 8-bit lanes
| VPROTB xmm1,xmm2/m128,xmm3
| XOP.9 90 /r
| {{yes}}
| rowspan=8 {{no}}
|-
| VPROTB xmm1,xmm2/m128,imm8
| XOP.8 C0 /r ib
| {{no}}
|-
| rowspan=2 | 16-bit lanes
| VPROTW xmm1,xmm2/m128,xmm3
| XOP.9 91 /r
| {{yes}}
|-
| VPROTW xmm1,xmm2/m128,imm8
| XOP.8 C1 /r ib
| {{no}}
|-
| rowspan=2 | 32-bit lanes
| VPROTD xmm1,xmm2/m128,xmm3
| XOP.9 92 /r
| {{yes}}
|-
| VPROTD xmm1,xmm2/m128,imm8
| XOP.8 C2 /r ib
| {{no}}
|-
| rowspan=2 | 64-bit lanes
| VPROTQ xmm1,xmm2/m128,xmm3
| XOP.9 93 /r
| {{yes}}
|-
| VPROTQ xmm1,xmm2/m128,imm8
| XOP.8 C3 /r ib
| {{no}}
|-
! colspan=6 |
|-
| rowspan=8 | Packed shift, with signed shift-amounts.

Shift-amount is provided on a per-vector-lane basis, and is taken from the bottom 8 bits of each lane of the last source argument. The shift-amount is considered signed: a positive value will cause left-shift, while a negative value causes right-shift.
| 8-bit, signed
| VPSHAB xmm1,xmm2/m128,xmm3
| XOP.9 98 /r
| rowspan=8 {{yes}}
| rowspan=8 {{no}}
|-
| 16-bit, signed
| VPSHAW xmm1,xmm2/m128,xmm3
| XOP.9 99 /r
|-
| 32-bit, signed
| VPSHAD xmm1,xmm2/m128,xmm3
| XOP.9 9A /r
|-
| 64-bit, signed
| VPSHAQ xmm1,xmm2/m128,xmm3
| XOP.9 9B /r
|-
| 8-bit, unsigned
| VPSHLB xmm1,xmm2/m128,xmm3
| XOP.9 94 /r
|-
| 16-bit, unsigned
| VPSHLW xmm1,xmm2/m128,xmm3
| XOP.9 95 /r
|-
| 32-bit, unsigned
| VPSHLD xmm1,xmm2/m128,xmm3
| XOP.9 96 /r
|-
| 64-bit, unsigned
| VPSHLQ xmm1,xmm2/m128,xmm3
| XOP.9 97 /r
|}

{{notelist}}
{{vpad}}
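Three of the operations in the table (VPCMOV, the VPCOM* compare and VPPERM) are described precisely enough to model in a few lines of Python. This is a sketch under stated assumptions, not a reference implementation: all function names are hypothetical, whole registers are modelled as Python integers or byte strings, and "src2:src1" is read as src1 supplying table bytes 0 to 15 and src2 supplying bytes 16 to 31.

```python
def vpcmov(src1, src2, src3, width=128):
    """Per-bit select: take src1 where src3 is 1, src2 where src3 is 0."""
    mask = (1 << width) - 1
    return ((src1 & src3) | (src2 & ~src3)) & mask


# Comparison functions indexed by the imm8 values listed in the table.
COMPARISONS = [
    lambda a, b: a < b,    # 0: LT
    lambda a, b: a <= b,   # 1: LE
    lambda a, b: a > b,    # 2: GT
    lambda a, b: a >= b,   # 3: GE
    lambda a, b: a == b,   # 4: EQ
    lambda a, b: a != b,   # 5: NE
    lambda a, b: False,    # 6: FALSE
    lambda a, b: True,     # 7: TRUE
]


def vpcom_lane(a, b, imm8, lane_bits=8):
    """Compare one lane; return all-1s if the comparison passes, else 0."""
    if not 0 <= imm8 <= 7:
        raise ValueError("this sketch only models imm8 values 0 to 7")
    return (1 << lane_bits) - 1 if COMPARISONS[imm8](a, b) else 0


def vpperm(src1, src2, src3):
    """Byte permute with optional transform, one selector byte per lane."""
    table = bytes(src1) + bytes(src2)  # assumed order: src1 low, src2 high
    out = bytearray()
    for selector in src3:
        byte = table[selector & 0x1F]          # bits 4:0 pick the byte
        transform = selector >> 6              # bits 7:6 pick the transform
        if transform == 1:
            byte = int(f"{byte:08b}"[::-1], 2)  # bit-reverse
        elif transform == 2:
            byte = 0x00                         # set to zero
        elif transform == 3:
            byte = 0xFF if byte & 0x80 else 0x00  # replicate MSB
        if selector & 0x20:                     # bit 5 inverts the result
            byte ^= 0xFF
        out.append(byte)
    return bytes(out)


selected = vpcmov(0b1100, 0b1010, 0b0110, width=4)
selectors = bytes([0x01, 0x11, 0x41, 0x80, 0xA0, 0xDF, 0x21] + [0] * 9)
permuted = vpperm(bytes(range(16)), bytes(range(16, 32)), selectors)
```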
XOP also included two vector instructions that used the VEX prefix instead of the XOP prefix:
{| class="wikitable"
|-
! Instruction description
! Instruction mnemonics
! Opcode
! W=1 swap allowed
! L=1 (256b) allowed
|-
| Permute two-source double-precision floating-point values.
| VPERMIL2PD ymm1,ymm2,ymm3/m256,ymm4,imm4
| VEX.NP.0F3A 49 /r /is4
| {{yes}} || {{yes}}
|-
| Permute two-source single-precision floating-point values.
| VPERMIL2PS ymm1,ymm2,ymm3/m256,ymm4,imm4
| VEX.NP.0F3A 48 /r /is4
| {{yes}} || {{yes}}
|}

The instructions VPERMIL2PD and VPERMIL2PS were originally defined by Intel in early drafts of the AVX specification; they were removed in later drafts and were never implemented in any Intel processor. They were, however, implemented by AMD, who designated them as being a part of the XOP instruction set extension. (Like the other parts of XOP, they have been removed in AMD Zen.)
{{vpad}}

FMA4 instructions

Supported in AMD processors starting with the Bulldozer architecture, removed in Zen. Not supported by any Intel chip as of 2023.

Fused multiply-add with four operands. FMA4 was realized in hardware before FMA3.
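What "fused" means numerically can be illustrated with a small Python sketch. It is not FMA4 itself: the function names are hypothetical, Python floats stand in for FP64 lanes, and exact rational arithmetic is used to emulate the single rounding step that a fused multiply-add performs.

```python
from fractions import Fraction


def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Compute a*b + c exactly, then round once to the nearest float."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


def unfused_multiply_add(a: float, b: float, c: float) -> float:
    """Round after the multiply and again after the add."""
    return a * b + c


a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

fused = fused_multiply_add(a, b, c)      # keeps the tiny residual
unfused = unfused_multiply_add(a, b, c)  # residual lost when a*b is rounded
```

With these inputs the exact product is slightly below 1, which the separate multiply rounds to 1.0 before the addition, so only the fused form preserves the difference.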

{,  class="wikitable"
! Instruction !! Opcode !! Meaning !! Notes
, -
,  {{nowrap, {{mono, VFMADDPD xmm0, xmm1, xmm2, xmm3  , ,  {{nowrap, {{mono, C4E3 WvvvvL01 69 /r /is4  , ,  Fused Multiply-Add of Packed Double-Precision Floating-Point Values , , 
, -
,  {{nowrap, {{mono, VFMADDPS xmm0, xmm1, xmm2, xmm3  , ,  {{nowrap, {{mono, C4E3 WvvvvL01 68 /r /is4  , ,  Fused Multiply-Add of Packed Single-Precision Floating-Point Values , , 
, -
,  {{nowrap, {{mono, VFMADDSD xmm0, xmm1, xmm2, xmm3  , ,  {{nowrap, {{mono, C4E3 WvvvvL01 6B /r /is4  , ,  Fused Multiply-Add of Scalar Double-Precision Floating-Point Values , , 
, -
,  {{nowrap, {{mono, VFMADDSS xmm0, xmm1, xmm2, xmm3  , ,  {{nowrap, {{mono, C4E3 WvvvvL01 6A /r /is4  , ,  Fused Multiply-Add of Scalar Single-Precision Floating-Point Values , , 
, -
,  {{nowrap, {{mono, VFMADDSUBPD xmm0, xmm1, xmm2, xmm3  , ,  {{nowrap, {{mono, C4E3 WvvvvL01 5D /r /is4  , ,  Fused Multiply-Alternating Add/Subtract of Packed Double-Precision Floating-Point Values , , 
, -
,  {{nowrap, {{mono, VFMADDSUBPS xmm0, xmm1, xmm2, xmm3  , ,  {{nowrap, {{mono, C4E3 WvvvvL01 5C /r /is4  , ,  Fused Multiply-Alternating Add/Subtract of Packed Single-Precision Floating-Point Values , , 
, -
,  {{nowrap, {{mono, VFMSUBADDPD xmm0, xmm1, xmm2, xmm3  , ,  {{nowrap, {{mono, C4E3 WvvvvL01 5F /r /is4  , ,  Fused Multiply-Alternating Subtract/Add of Packed Double-Precision Floating-Point Values , , 
, -
,  {{nowrap, {{mono, VFMSUBADDPS xmm0, xmm1, xmm2, xmm3  , ,  {{nowrap, {{mono, C4E3 WvvvvL01 5E /r /is4  , ,  Fused Multiply-Alternating Subtract/Add of Packed Single-Precision Floating-Point Values , , 
, -
,  {{nowrap, {{mono, VFMSUBPD xmm0, xmm1, xmm2, xmm3  , ,  {{nowrap, {{mono, C4E3 WvvvvL01 6D /r /is4  , ,  Fused Multiply-Subtract of Packed Double-Precision Floating-Point Values , , 
, -
,  {{nowrap, {{mono, VFMSUBPS xmm0, xmm1, xmm2, xmm3  , ,  {{nowrap, {{mono, C4E3 WvvvvL01 6C /r /is4  , ,  Fused Multiply-Subtract of Packed Single-Precision Floating-Point Values , , 
, -
,  {{nowrap, {{mono, VFMSUBSD xmm0, xmm1, xmm2, xmm3  , ,  {{nowrap, {{mono, C4E3 WvvvvL01 6F /r /is4  , ,  Fused Multiply-Subtract of Scalar Double-Precision Floating-Point Values , , 
, -
,  {{nowrap, {{mono, VFMSUBSS xmm0, xmm1, xmm2, xmm3  , ,  {{nowrap, {{mono, C4E3 WvvvvL01 6E /r /is4  , ,  Fused Multiply-Subtract of Scalar Single-Precision Floating-Point Values , , 
, -
,  {{nowrap, {{mono, VFNMADDPD xmm0, xmm1, xmm2, xmm3  , ,  {{nowrap, {{mono, C4E3 WvvvvL01 79 /r /is4  , ,  Fused Negative Multiply-Add of Packed Double-Precision Floating-Point Values , , 
, -
,  {{nowrap, {{mono, VFNMADDPS xmm0, xmm1, xmm2, xmm3  , ,  {{nowrap, {{mono, C4E3 WvvvvL01 78 /r /is4  , ,  Fused Negative Multiply-Add of Packed Single-Precision Floating-Point Values , , 
, -
,  {{nowrap, {{mono, VFNMADDSD xmm0, xmm1, xmm2, xmm3  , ,  {{nowrap, {{mono, C4E3 WvvvvL01 7B /r /is4  , ,  Fused Negative Multiply-Add of Scalar Double-Precision Floating-Point Values , , 
, -
,  {{nowrap, {{mono, VFNMADDSS xmm0, xmm1, xmm2, xmm3  , ,  {{nowrap, {{mono, C4E3 WvvvvL01 7A /r /is4  , ,  Fused Negative Multiply-Add of Scalar Single-Precision Floating-Point Values , , 
, -
,  {{nowrap, {{mono, VFNMSUBPD xmm0, xmm1, xmm2, xmm3  , ,  {{nowrap, {{mono, C4E3 WvvvvL01 7D /r /is4  , ,  Fused Negative Multiply-Subtract of Packed Double-Precision Floating-Point Values , , 
, -
,  {{nowrap, {{mono, VFNMSUBPS xmm0, xmm1, xmm2, xmm3  , ,  {{nowrap, {{mono, C4E3 WvvvvL01 7C /r /is4  , ,  Fused Negative Multiply-Subtract of Packed Single-Precision Floating-Point Values , , 
, -
,  {{nowrap, {{mono, VFNMSUBSD xmm0, xmm1, xmm2, xmm3  , ,  {{nowrap, {{mono, C4E3 WvvvvL01 7F /r /is4  , ,  Fused Negative Multiply-Subtract of Scalar Double-Precision Floating-Point Values , , 
, -
,  {{nowrap, {{mono, VFNMSUBSS xmm0, xmm1, xmm2, xmm3  , ,  {{nowrap, {{mono, C4E3 WvvvvL01 7E /r /is4  , ,  Fused Negative Multiply-Subtract of Scalar Single-Precision Floating-Point Values , , 


   Trailing Bit Manipulation Instructions 


AMD introduced TBM together with BMI1 in its Piledriver{{cite web, last1=Hollingsworth, first1=Brent, title=New "Bulldozer" and "Piledriver" instructions, url=http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2012/10/New-Bulldozer-and-Piledriver-Instructions.pdf, publisher=Advanced Micro Devices, Inc., access-date=11 December 2014, archive-url=https://web.archive.org/web/20140726174733/http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2012/10/New-Bulldozer-and-Piledriver-Instructions.pdf, archive-date=26 Jul 2014}} line of processors; later AMD Jaguar and Zen-based processors do not support TBM.{{cite web , url=http://support.amd.com/TechDocs/52169_KB_A_Series_Mobile.pdf , title=Family 16h AMD A-Series Data Sheet , date=October 2013 , access-date=2014-01-02 , publisher=AMD , work=amd.com , archive-url=https://web.archive.org/web/20131107153414/http://support.amd.com/TechDocs/52169_KB_A_Series_Mobile.pdf , archive-date=7 Nov 2013}} No Intel processors (as of 2023) support TBM.

The TBM instructions are all encoded using the XOP prefix. They are all available in 32-bit and 64-bit forms, selected with the XOP.W bit (0=32bit, 1=64bit). (XOP.W is ignored outside 64-bit mode.) Like all instructions encoded with VEX/XOP prefixes, they are unavailable in Real Mode and Virtual-8086 mode.

{,  class="wikitable"
, -
! Instruction
! Opcode
! Description{{cite web, url=http://support.amd.com/TechDocs/24594.pdf, title=AMD64 Architecture Programmer's Manual, Volume 3: General-Purpose and System Instructions, date=October 2013 , access-date=2014-01-02 , publisher=AMD , work=amd.com , archive-url=https://web.archive.org/web/20140104003442/http://support.amd.com/TechDocs/24594.pdf , archive-date=4 Jan 2014}}
! Equivalent C expression
, -
,  BEXTR reg,r/m,imm32
,  XOP.A 10 /r imm32
,  Bit field extract (immediate form){{efn, text=For BEXTR, a register form is available as part of BMI1.}}

The imm32 is interpreted as follows:
* Bit 7:0 : start position
* Bit 15:8 : length
* Bit 31:16 : ignored
,   (src >> start) & ((1 << len) - 1)
, -
,  BLCFILL reg,r/m
,  XOP.9 01 /1
,  Fill from lowest clear bit
,  x & (x + 1)
, -
,  BLCI reg,r/m
,  XOP.9 02 /6
,  Isolate lowest clear bit
,  x | ~(x + 1)
, -
,  BLCIC reg,r/m
,  XOP.9 01 /5
,  Isolate lowest clear bit and complement
,  ~x & (x + 1)
, -
,  BLCMSK reg,r/m
,  XOP.9 02 /1
,  Mask from lowest clear bit
,  x ^ (x + 1)
, -
,  BLCS reg,r/m
,  XOP.9 01 /3
,  Set lowest clear bit
,  x | (x + 1)
, -
,  BLSFILL reg,r/m
,  XOP.9 01 /2
,  Fill from lowest set bit
,  x | (x - 1)
, -
,  BLSIC reg,r/m
,  XOP.9 01 /6
,  Isolate lowest set bit and complement
,  ~x | (x - 1)
, -
,  T1MSKC reg,r/m
,  XOP.9 01 /7
,  Inverse mask from trailing ones
,  ~x | (x + 1)
, -
,  TZMSK reg,r/m
,  XOP.9 01 /4
,  Mask from trailing zeros
,  ~x & (x - 1)

{{notelist
{{vpad
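The expressions in the "Equivalent C expression" column of the table above can be checked directly in C. The following is a minimal sketch for the 32-bit forms only. The function names are illustrative (they are not compiler intrinsics), and the range guards in `bextr_imm` are an assumption added to keep the C shifts well defined, not a statement of documented hardware behaviour.

```c
#include <stdint.h>

/* 32-bit models of the TBM operations, transcribed from the table.
   All names are hypothetical helpers for illustration. */
uint32_t blcfill(uint32_t x) { return x & (x + 1); }   /* fill from lowest clear bit      */
uint32_t blci(uint32_t x)    { return x | ~(x + 1); }  /* isolate lowest clear bit        */
uint32_t blcic(uint32_t x)   { return ~x & (x + 1); }  /* ... and complement              */
uint32_t blcmsk(uint32_t x)  { return x ^ (x + 1); }   /* mask from lowest clear bit      */
uint32_t blcs(uint32_t x)    { return x | (x + 1); }   /* set lowest clear bit            */
uint32_t blsfill(uint32_t x) { return x | (x - 1); }   /* fill from lowest set bit        */
uint32_t blsic(uint32_t x)   { return ~x | (x - 1); }  /* isolate lowest set bit, compl.  */
uint32_t t1mskc(uint32_t x)  { return ~x | (x + 1); }  /* inverse mask from trailing ones */
uint32_t tzmsk(uint32_t x)   { return ~x & (x - 1); }  /* mask from trailing zeros        */

/* BEXTR, immediate form: imm32 bits 7:0 = start, bits 15:8 = length,
   bits 31:16 ignored. */
uint32_t bextr_imm(uint32_t src, uint32_t imm32) {
    uint32_t start = imm32 & 0xFF;
    uint32_t len   = (imm32 >> 8) & 0xFF;
    /* Assumption: guard against shift counts >= 32, which are undefined in C. */
    if (start >= 32 || len == 0) return 0;
    uint32_t shifted = src >> start;
    if (len >= 32) return shifted;
    return shifted & ((UINT32_C(1) << len) - 1);
}
```

For example, `blcic(0x5)` yields `0x2` (the lowest clear bit of binary 0101), and `bextr_imm(0xABCD1234, 0x0804)` extracts the 8-bit field starting at bit 4, giving `0x23`.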

  Lightweight Profiling instructions 


The AMD Lightweight Profiling (LWP) feature was introduced in AMD Bulldozer and removed in AMD Zen. On all supported CPUs, the latest available microcode updates have disabled LWP due to Spectre mitigations.

These instructions are available in Ring 3, but not available in Real Mode and Virtual-8086 mode. All of them use the XOP prefix.

{,  class="wikitable"
, -
! Instruction
! Opcode
! Description
, -
,  LLWPCB r32/64
,  XOP.9 12 /0
,  Load LWPCB (Lightweight Profiling Control Block) address.{{efn, name=lwpcb_addr, text=The address used by LLWPCB and SLWPCB is an effective-address, specified relative to the DS: segment base address. LLWPCB converts this effective-address to a linear-address by adding the DS base address to it, and SLWPCB converts it back by subtracting the DS base address. Changing the DS base address while LWP is enabled will thereby cause SLWPCB to return a different address than what was specified to LLWPCB, and may also cause XSAVE to fail to save LWP state properly.}}

Loading an address of 0 disables LWP. Loading a nonzero address will cause the CPU to perform validation of the specified LWPCB, then enable LWP if the validation passed. If LWP was already enabled, state for the previous LWPCB is flushed to memory.
, -
,  SLWPCB r32/64
,  XOP.9 12 /1
,  Store LWPCB address{{efn, name=lwpcb_addr}} to register, and flush LWP state to memory.

If LWP is not enabled, the stored address is 0.
, -
,  {{nowrap, LWPINS r32/64, r/m32, imm32
,  {{nowrap, XOP.A 12 /0 imm32
,  Insert user event record with EventID=255 in LWP ring buffer. The arguments are inserted into the event record as follows:
* The first argument is stored in bytes 23:16 (zero-extended if 32-bit)
* The second argument is stored in bytes 7:4
* The low 16 bits of the imm32 are stored in bytes 3:2 (the high 16 bits are ignored)

The LWPINS instruction sets CF=1 if LWP is enabled and the ring buffer is full, CF=0 otherwise.
, -
,  {{nowrap, LWPVAL r32/64, r/m32, imm32
,  {{nowrap, XOP.A 12 /1 imm32
,  Decrement the event counter associated with the programmed value sample event. If the resulting counter value ends up negative, insert an event record with EventID=1 in LWP ring buffer. (The instruction arguments are inserted in this record in the same way as for LWPINS.)

Executes as NOP if LWP is not enabled or if the event counter is not enabled. If no event record is inserted, then the second argument (which may be a memory argument) is not accessed.

{{notelist
{{vpad
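The argument placement described for LWPINS (and reused by LWPVAL) can be sketched as follows. Only the three byte ranges come from the table above; the 32-byte record size, the zero-filling of all other bytes, and the function name are assumptions made for illustration.

```c
#include <stdint.h>
#include <string.h>

#define LWP_RECORD_BYTES 32  /* assumed record size; not stated in this section */

/* Fill the argument-derived fields of an LWP event record, little-endian.
   All other bytes (including the EventID field) are simply left as zero here. */
void lwp_fill_args(uint8_t rec[LWP_RECORD_BYTES],
                   uint64_t arg1, uint32_t arg2, uint32_t imm32) {
    memset(rec, 0, LWP_RECORD_BYTES);
    for (int i = 0; i < 2; i++)   /* bytes 3:2   <- low 16 bits of imm32 */
        rec[2 + i] = (uint8_t)(imm32 >> (8 * i));
    for (int i = 0; i < 4; i++)   /* bytes 7:4   <- second argument */
        rec[4 + i] = (uint8_t)(arg2 >> (8 * i));
    for (int i = 0; i < 8; i++)   /* bytes 23:16 <- first argument */
        rec[16 + i] = (uint8_t)(arg1 >> (8 * i));
}
```

Passing a 32-bit first argument as a `uint32_t` zero-extends it to 64 bits at the call site, which mirrors the zero-extension noted in the table.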

  Instructions from other vendors 


  Instructions specific to  NEC V-series processors 

These instructions are specific to the NEC V20/V30 CPUs and their successors, and do not appear in any non-NEC CPUs. Many of their opcodes have been reassigned to other instructions in later non-NEC CPUs.
{{sticky header
{,  class="wikitable sticky-header"
, + NEC V-series instructions
! Instruction
! Opcode
! Description
! Available on
, -
! colspan="4" , 
, -
,  TEST1 r/m8, CL
TEST1 r/m16, CL
,  0F 10 /0
0F 11 /0
,  rowspan="2" ,  Test one bit. Sets FLAGS.ZF to 0 if the bit is set, 1 otherwise.
First argument specifies an 8/16-bit register or memory location to test a bit in.

Second argument specifies which bit to test.
,  rowspan="20" ,  All V-series (NEC, 16-bit V-series User's Manual, Sep 2000; archived on Dec 2, 2021) except V30MZ
, -
,  TEST1 r/m8, imm8
TEST1 r/m16, imm8
,  0F 18 /0 ib
0F 19 /0 ib
, -
,  CLR1 r/m8, CL
CLR1 r/m16, CL
,  0F 12 /0
0F 13 /0
,  rowspan="2" ,  Clear one bit.
, -
,  CLR1 r/m8, imm8
CLR1 r/m16, imm8
,  0F 1A /0 ib
0F 1B /0 ib
, -
,  SET1 r/m8, CL
SET1 r/m16, CL
,  0F 14 /0
0F 15 /0
,  rowspan="2" ,  Set one bit.
, -
,  SET1 r/m8, imm8
SET1 r/m16, imm8
,  0F 1C /0 ib
0F 1D /0 ib
, -
,  NOT1 r/m8, CL
NOT1 r/m16, CL
,  0F 16 /0
0F 17 /0
,  rowspan="2" ,  Invert one bit.
, -
,  NOT1 r/m8, imm8
NOT1 r/m16, imm8
,  0F 1E /0 ib
0F 1F /0 ib
, -
,  ADD4S
,  0F 20
,  Add Nibble Strings.
Performs a string addition of integers in packed BCD format (2 BCD digits per byte). DS:SI points to a source integer, ES:DI to a destination integer, and CL provides the number of digits to add. The operation is then:

destination <- destination + source
, -
,  SUB4S
,  0F 22
,  Subtract Nibble Strings.
destination <- destination − source
, -
,  CMP4S
,  0F 26
,  Compare Nibble Strings.
, -
,  ROL4 r/m8
,  0F 28 /0
,  Rotate Left Nibble.
Concatenates its 8-bit argument with the bottom 4 bits of AL to form a 12-bit bitvector, then left-rotates this bitvector by 4 bits, then writes this bitvector back to its argument and the bottom 4 bits of AL.
, -
,  ROR4 r/m8
,  0F 2A /0
,  Rotate Right Nibble. Similar to ROL4, except performs a right-rotate by 4 bits.
, -
,  EXT r8,r8
,  0F 33 /r
,  rowspan="2" ,  Bitfield extract.
Perform a bitfield read from memory. DS:SI (DS0:IX in NEC nomenclature) points to memory location to read from, first argument specifies bit-offset to read from, and second argument specifies the number of bits to read minus 1. The result is placed in AX. After the bitfield read, SI and the first argument are updated to point just beyond the just-read bitfield.
, -
,  EXT r8,imm8
,  0F 3B /0 ib
, -
,  INS r8,r8
,  0F 31 /r
,  rowspan="2" ,  Bitfield Insert.
Perform a bitfield write to memory. ES:DI (DS1:IY in NEC nomenclature) points to memory location to write to, AX contains data to write, first argument specifies bit-offset to write to, and second argument specifies the number of bits to write minus 1. After the bitfield write, DI and the first argument are updated to point just beyond the just-written bitfield.
, -
,  INS r8,imm8
,  0F 39 /0 ib
, -
,  REPC
,  65
,  Repeat if carry. Instruction prefix for use with CMPS/SCAS.
, -
,  REPNC
,  64
,  Repeat if not carry. Instruction prefix for use with CMPS/SCAS.
, -
,  FPO2
,  66 /r
67 /r
,  "Floating Point Operation 2": extra escape opcodes for floating-point coprocessor, in addition to the standard D8-DF ones used for x87.

The FPO2 escape opcodes are used by the NEC 72291 floating-point coprocessor - this coprocessor also uses the standard D8-DF escape opcodes, but uses them to encode an instruction set that is unique to the 72291 and not compatible with x87. A listing of the opcodes/instructions supported by the 72291 is available.
, -
! colspan="4" , 
, -
,  BRKEM imm8
,  0F FF ib
,  Break to 8080 emulation mode.{{efn, text=The Intel 8080 emulation mode of NEC V20/V30/V40/V50 supports the following NEC-specific instructions in addition to the basic 8080 instruction set:
{{(! class="wikitable sortable"
! Instruction !! Opcode !! Description
{{!-
{{! CALLN imm8 {{!! {{nowrap, ED ED ib {{!! Call to native mode
{{!-
{{! RETEM {{!! ED FD {{!! Return from 8080 emulation mode
{{!)

Jump to an address picked from the IVT (Interrupt Vector Table) using the imm8 argument, similar to the 8086 INT instruction, but start executing as Intel 8080 code rather than x86 code.
,  V20, V30, V40, V50
, -
! colspan="4" , 
, -
,  BRKXA imm8
,  0F E0 ib
,  Break to Extended Address Mode.
Jump to an address picked from the IVT using the imm8 argument. Enables a simple memory paging mechanism after reading the IVT but before executing the jump.
The paging mechanism uses an on-chip page table with 16Kbyte pages and no access rights checking. 
,  rowspan="2" ,  V33, V53
, -
,  RETXA imm8
,  0F F0 ib
,  Return from Extended Address Mode.
Jump to an address picked from the IVT using the imm8 argument. Disables paging after reading the IVT but before executing the jump.
, -
! colspan="4" , 
, -
,  MOVSPA
,  0F 25
,  Transfer both SS and SP of old register bank after the bank has been switched by an interrupt or BRKCS instruction.
,  rowspan="8" ,  V25, V35, V55
, -
,  BRKCS r16
,  0F 2D /0
,  Perform software interrupt with context switch to register bank specified by low 3 bits of r16.
, -
,  RETRBI
,  0F 91
,  Return from register bank context switch interrupt.
, -
,  FINT
,  0F 92
,  Finish Interrupt.
, -
,  TSKSW r16
,  0F 94 /7
,  Perform task switch to register bank indicated by low 3 bits of r16.
, -
,  MOVSPB r16
,  0F 95 /7
,  Transfer SS and SP of current register bank to register bank indicated by low 3 bits of r16.
, -
,  {{nowrap, BTCLR imm8,imm8,cb
,  {{nowrap, 0F 9C ib ib rel8
,  Bit Test and Clear.
The first argument specifies a V25/V35 Special Function Register to test a bit in. The second argument specifies a bit position in that register. The third argument specifies a short branch offset. If the bit was set to 1, then it is cleared and a short branch is taken, else the branch is not taken.
, -
,  STOP
,  0F 9E
,  CPU Halt.

Differs from the conventional 8086 HLT instruction in that the clock is stopped too, so that an NMI or CPU reset is needed to resume operation.
, -
! colspan="4" , 
, -
,  BRKS imm8
,  F1 ib
,  Break and Enable Software Guard.
Jump to an address picked from the IVT using the imm8 argument, and then continue execution with "Software Guard" enabled. The "Software Guard" is an 8-bit substitution cipher that, during instruction fetch/decode, translates opcode bytes using a 256-entry lookup table stored in an on-chip mask ROM.
,  rowspan="2" ,  V25, V35 "Software Guard"
, -
,  BRKN imm8
,  63 ib
,  Break and Enable Native Mode. Similar to BRKS, except that it disables "Software Guard" rather than enabling it.
, -
! colspan="4" , 
, -
,  MOV r/m,DS3
,  8C /6
,  rowspan="8" ,  Move to/from the DS2 and DS3 extended segment registers.

The DS2 and DS3 registers (which are specific to the NEC V55) act similarly to regular x86 real mode segment registers except that they are left-shifted by 8 rather than 4, enabling access to 16MB of memory. Block transfer instructions, such as MOVBKW, can access the 16MB memory space by simultaneously prefixing with DS2 and DS3.
,  rowspan="19" ,  V55 (Renesas, NEC V55PI 16-bit microprocessor Data Sheet, U11775E; archived on Jul 27, 2023)
, -
,  MOV r/m,DS2
,  8C /7
, -
,  MOV DS3,r/m
,  8E /6
, -
,  MOV DS2,r/m
,  8E /7
, -
,  PUSH DS3
,  0F 76
, -
,  POP DS3
,  0F 77
, -
,  PUSH DS2
,  0F 7E
, -
,  POP DS2
,  0F 7F
, -
,  {{nowrap, MOV DS3,r16,m32
,  0F 36 /r
,  rowspan="2" ,  Instructions to load both extended segment register and general-purpose register at once, similar to 8086's LDS and LES instructions
, -
,  {{nowrap, MOV DS2,r16,m32
,  0F 3E /r
, -
,  DS2:
,  63
,  rowspan="2" ,  Segment-override prefixes for the DS2 and DS3 extended segments.

When used with string instructions such as MOVBKW, the DS2: prefix overrides the DS segment, while the DS3: prefix overrides the ES segment. 
, -
,  DS3:
,  D6
, -
,  IRAM:
,  F1
,  Register File Override Prefix. Will cause memory operands to index into register file rather than general memory.
, -
,  BSCH r/m8
BSCH r/m16
,  0F 3C /0
0F 3D /0
,  Count Trailing Zeroes and store result in CL. Sets ZF=1 for all-0s input.
, -
,  {{nowrap, RSTWDT imm8,imm8
,  0F 96 ib ib
,  Watchdog Timer Manipulation Instruction.
, -
,  {{nowrap, BTCLRL imm8,imm8,cb
,  {{nowrap, 0F 9D ib ib rel8
,  Bit test and clear for second bank of special purpose registers (similar to BTCLR).
, -
,  QHOUT imm16
,  0F E0 iw
,  rowspan="3" ,  Queue manipulation instructions.
, -
,  QOUT imm16
,  0F E1 iw
, -
,  QTIN imm16
,  0F E2 iw
, -
! colspan="4" , 
, -
,  IDLE
,  0F 9F
,  Put CPU in idle mode.
,  V55SC (NEC V55SC 16-bit Microprocessor Preliminary Data Sheet (O.D.No ID-8206A, March 1993), pages 70 and 127; located on Apr 20, 2022 by searching for "nec v55sc" at datasheetarchive.com; archived on Nov 22, 2022)
, -
! colspan="4" , 
, -
,  ALBIT
,  0F 9A
,  rowspan="9" ,  Dedicated fax instructions.
,  rowspan="9" ,  V55PI
, -
,  COLTRP
,  0F 9B
, -
,  MHENC
,  0F 93
, -
,  MRENC
,  0F 97
, -
,  SCHEOL
,  0F 78
, -
,  GETBIT
,  0F 79
, -
,  MHDEC
,  0F 7C
, -
,  MRDEC
,  0F 7D
, -
,  CNVTRP
,  0F 7A
, -
! colspan="4" , 
, -
,  (no mnemonic)
,  63
,  Designated opcode for termination of the x86 emulation mode on the NEC V60.
,  V60, V70

{{notelist
{{vpad
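Two of the behaviours described in the table above lend themselves to a short C sketch: the nibble rotates (ROL4/ROR4) and the address formation used by the V55 extended segment registers. The function names are hypothetical. Preserving the upper four bits of AL and wrapping the extended address to 24 bits are assumptions; the table only states that the bottom 4 bits of AL are written and that the shift enables access to 16MB.

```c
#include <stdint.h>

/* ROL4: concatenate the 8-bit operand with the low nibble of AL into a
   12-bit value (operand in the upper 8 bits), rotate left by 4, write back. */
void nec_rol4(uint8_t *operand, uint8_t *al) {
    uint16_t v = (uint16_t)((*operand << 4) | (*al & 0x0F));
    v = (uint16_t)(((v << 4) | (v >> 8)) & 0x0FFF);
    *operand = (uint8_t)(v >> 4);
    *al = (uint8_t)((*al & 0xF0) | (v & 0x0F));  /* upper AL nibble kept: assumption */
}

/* ROR4: same 12-bit value, rotated right by 4. */
void nec_ror4(uint8_t *operand, uint8_t *al) {
    uint16_t v = (uint16_t)((*operand << 4) | (*al & 0x0F));
    v = (uint16_t)(((v >> 4) | (v << 8)) & 0x0FFF);
    *operand = (uint8_t)(v >> 4);
    *al = (uint8_t)((*al & 0xF0) | (v & 0x0F));
}

/* Conventional real-mode address: segment shifted left by 4. */
uint32_t real_mode_address(uint16_t seg, uint16_t offset) {
    return ((uint32_t)seg << 4) + offset;
}

/* V55 DS2/DS3 address: segment shifted left by 8, giving a 24-bit space. */
uint32_t v55_ext_address(uint16_t seg, uint16_t offset) {
    return (((uint32_t)seg << 8) + offset) & 0xFFFFFFu;  /* 24-bit wrap: assumption */
}
```

With an operand of `0x12` and AL of `0x34`, `nec_rol4` leaves the operand as `0x24` and AL as `0x31`, while `nec_ror4` leaves the operand as `0x41` and AL as `0x32`. The same segment and offset pair `0x1234:0x0010` maps to `0x12350` in real mode and to `0x123410` through an extended segment register.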

  Instructions specific to Cyrix and Geode CPUs 


These instructions are present in Cyrix CPUs as well as NatSemi/AMD Geode CPUs derived from Cyrix microarchitectures (Geode GX and LX, but not NX). They are also present in Cyrix manufacturing partner CPUs from IBM, ST and TI, as well as the VIA Cyrix III ("Joshua" core only, not "Samuel") and a few SoCs such as STPC ATLAS and ZFMicro ZFx86. Many of these opcodes have been reassigned to other instructions in later non-Cyrix CPUs.

{,  class="wikitable"
! Instruction
! Opcode
! Description
! Available on
, -
,  {{nowrap, SVDC m80,sreg
,  0F 78 /r
,  Save segment register and descriptor to memory as a 10-byte data structure.

The first 8 bytes are the descriptor, the last two bytes are the selector.
,  rowspan="6" ,  System Management Mode instructions.{{efn, The Cyrix SMM instructions also include RSM ({{nowrap, 0F AA}}; Return from System Management mode), however, RSM is not a Cyrix-specific instruction, and it continues to exist in modern non-Cyrix x86 processors.}}

Not present on stepping A of Cx486SLC and Cx486DLC (Debbie Wiles, CPU identification, archived on 2004-06-04).

Present on Cx486SLC/e (Cyrix 486SLC/e Data Sheet (1992), section 2.6.4) and all later Cyrix CPUs.

Present on all Cyrix-derived Geode CPUs.
, -
,  {{nowrap, RSDC sreg,m80}}{{efn, RSDC with CS as a destination register is only supported on NatSemi Geode GX2 and AMD Geode GX/LX; on other processors, it causes #UD.}}
,  0F 79 /r
,  Restore segment register and descriptor from memory
, -
,  SVLDT m80
,  {{nowrap, 0F 7A /0
,  Save LDTR and descriptor
, -
,  RSLDT m80
,  0F 7B /0
,  Restore LDTR and descriptor
, -
,  SVTS m80
,  0F 7C /0
,  Save TSR and descriptor
, -
,  RSTS m80
,  0F 7D /0
,  Restore TSR and descriptor
, -
,  rowspan="2" ,  SMINT{{efn, Some assemblers/disassemblers, such as NASM, use the instruction mnemonic SMINTOLD for the 0F 7E encoding.}}
,  0F 7E
,  rowspan="2" ,  System management software interrupt.
Uses {{nowrap, 0F 7E encoding on Cyrix 486, 5x86, 6x86 and ZFx86.

Uses {{nowrap, 0F 38 encoding on Cyrix 6x86MX, MII, MediaGX and Geode.
,  rowspan="2" ,  Cyrix 486S and later processors - not available on older Cyrix 486SLC/DLC/SRx2/DRx2 processors.

Not available on any Ti486 processors.
, -
,  0F 38
, -
,  RDSHR r/m32
,  0F 36 /0{{efn, name=cyrix_rdshr_arg, text=For the RDSHR and WRSHR instructions, Cyrix's documentation specifies that the instruction accepts a ModR/M byte but does not specify the encoding of the ModR/M byte's reg field. NASM v0.98.31 and later uses /0 for these instructions, while sandpile.org's opcode tables indicate that the reg field is ignored for these instructions.}}
,  Read SMM Header Pointer Register
,  rowspan="2" ,  Cyrix 6x86MX (Cyrix 6x86MX Data Book, section 2.15.3) and MII

VIA Cyrix III
, -
,  WRSHR r/m32
,  0F 37 /0{{efn, name=cyrix_rdshr_arg}}
,  Write SMM Header Pointer Register
, -
,  BB0_RESET
,  0F 3A
,  Reset BLT Buffer Pointer 0 to base
,  rowspan="4" ,  Cyrix MediaGX and MediaGXm

NatSemi Geode GXm, GXLV, GX1
, -
,  BB1_RESET
,  0F 3B
,  Reset BLT Buffer Pointer 1 to base
, -
,  CPU_WRITE
,  0F 3C
,  Write to CPU internal special register (EBX=register-index, EAX=data)
, -
,  CPU_READ
,  0F 3D
,  Read from CPU internal special register (EBX=register-index, EAX=data)
, -
,  DMINT
,  0F 39
,  Debug Management Mode Interrupt
,  rowspan="2" ,  NatSemi Geode GX2

AMD Geode GX, LX (AMD, Geode LX Processors Data Book, Feb 2009, publication ID 33234H, section 8.3.4, pages 643-657; archived on 3 Dec 2023)
, -
,  RDM
,  0F 3A
,  Return from Debug Management Mode

{{notelist
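The 10-byte memory image written by SVDC and read by RSDC (an 8-byte descriptor followed by a 2-byte selector, per the table above) can be pictured with a small C sketch. The type and function names are hypothetical, and reading the selector as little-endian is an assumption based on the usual x86 byte order, not something this section states.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory view of the SVDC/RSDC image. */
typedef struct {
    uint8_t  descriptor[8];  /* bytes 0..7: segment descriptor */
    uint16_t selector;       /* bytes 8..9: segment selector   */
} svdc_image;

/* Decode from raw bytes so that the result does not depend on
   compiler-specific struct padding. */
svdc_image svdc_parse(const uint8_t raw[10]) {
    svdc_image img;
    memcpy(img.descriptor, raw, 8);
    img.selector = (uint16_t)(raw[8] | (raw[9] << 8));  /* little-endian: assumption */
    return img;
}
```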

  Cyrix  EMMI instructions 


These instructions were introduced in the Cyrix 6x86MX and MII processors, and were also present in the MediaGXm and Geode GX1 processors. (In later non-Cyrix processors, all of their opcodes have been used for SSE or SSE2 instructions.)

These instructions are integer SIMD instructions acting on 64-bit vectors in MMX registers or memory. Each instruction takes two explicit operands, where the first one is an MMX register operand and the second one is either a memory operand or a second MMX register. In addition, several of the instructions take an implied operand, which is an MMX register implied from the first operand as follows:
{,  class="wikitable"
, -
! First explicit operand
,  mm0 , ,  mm1 , ,  mm2 , ,  mm3 , ,  mm4 , ,  mm5 , ,  mm6 , ,  mm7
, -
! Implied operand
,  mm1 , ,  mm0 , ,  mm3 , ,  mm2 , ,  mm5 , ,  mm4 , ,  mm7 , ,  mm6

In the instruction descriptions in the below table, arg1 and arg2 refer to the two explicit operands of the instruction, and imp to the implied operand.  
{,  class="wikitable"
, -
! Instruction !! Opcode !! colspan=2 ,  Description
, -
,  PAVEB mm,mm/m64 , ,  0F 50 /r , ,  colspan=2 ,  Packed average bytes:{{efn, Implementations differ on whether the PAVEB instruction treats the bytes as signed or unsigned.}}
arg1 <- (arg1+arg2) >> 1
, -
,  {{nowrap, PADDSIW mm,mm/m64 , ,  0F 51 /r , ,  colspan=2 ,  Packed add signed words with saturation, using implied destination:
imp <- saturate_s16(arg1+arg2)
, -
,  PMAGW mm,mm/m64 , ,  0F 52 /r , ,  colspan=2 ,  Packed signed word magnitude maximum value:
if (abs(arg2) > abs(arg1)) then arg1 <- arg2
, -
,  PDISTIB mm,m64{{efn, name=emmi_memop, text=For PDISTIB, PMACHRIW and the PMV* instructions, the second explicit operand is required to be a memory operand; register operands are not supported.}} , ,  {{nowrap, 0F 54 /r}} , ,  colspan=2 ,  Packed unsigned byte distance and accumulate to implied destination, with saturation:
imp <- saturate_u8(imp + (abs(arg1-arg2)))
, -
,  {{nowrap, PSUBSIW mm,mm/m64 , ,  0F 55 /r , ,  colspan=2 ,  Packed subtract signed words with saturation, using implied destination:
imp <- saturate_s16(arg1-arg2)
, -
,  {{nowrap, PMULHRW mm,mm/m64}},{{efn, The Cyrix EMMI PMULHRW instruction has the same mnemonic as the 3DNow! PMULHRW instruction, however its opcode and function differ (the EMMI instruction right-shifts its multiply-result by 15 bits, while the 3DNow! instruction right-shifts by 16 bits). Some assemblers/disassemblers, such as NASM, resolve this ambiguity by using the mnemonic PMULHRWA for the 3DNow! instruction and PMULHRWC for the EMMI instruction.}}
{{nowrap, 1=PMULHRWC mm,mm/m64}} , ,  0F 59 /r , ,  colspan=2 ,  Packed signed word multiply high with rounding:
arg1 <- (arg1*arg2+0x4000)>>15
, -
,  {{nowrap, PMULHRIW mm,mm/m64 , ,  0F 5D /r , ,  colspan=2 ,  Packed signed word multiply high with rounding and implied destination:
imp <- (arg1*arg2+0x4000)>>15
, -
,  PMACHRIW mm,m64{{efn, name=emmi_memop}} , ,  0F 5E /r , ,  colspan=2 ,  Packed signed word multiply high with rounding and accumulation to implied destination:
imp <- imp + ((arg1*arg2+0x4000)>>15)
, -
! colspan=4 , 
, -
,  PMVZB mm,m64{{efn, name=emmi_memop}} , ,  0F 58 /r , ,  {{nowrap, 1=if (imp == 0) then arg1 <- arg2}}
,  rowspan=4 ,  Packed conditional load from memory to MMX register.

Condition is evaluated on a per-byte-lane basis, by comparing byte lanes in the implied source to zero (with signed compare) − if the comparison passes, then the corresponding destination lane is loaded from memory, otherwise it keeps its original value.
, -
,  PMVNZB mm,m64{{efn, name=emmi_memop}} , ,  0F 5A /r , ,  {{nowrap, 1=if (imp != 0) then arg1 <- arg2}}
, -
,  PMVLZB mm,m64{{efn, name=emmi_memop}} , ,  0F 5B /r , ,  {{nowrap, 1=if (imp <  0) then arg1 <- arg2}}
, -
,  PMVGEZB mm,m64{{efn, name=emmi_memop}} , ,  0F 5C /r , ,  {{nowrap, 1=if (imp >= 0) then arg1 <- arg2}}

{{notelist
{{vpad
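The implied-operand pairing and a few of the per-lane formulas from the tables above can be written out in C as a rough sketch. All function names are hypothetical, each arithmetic function models a single lane rather than a full 64-bit MMX register, and the right shift in `pmulhrw_lane` assumes the compiler performs an arithmetic shift on negative values (which C leaves implementation-defined).

```c
#include <stdint.h>

/* Implied operand: the other register of the even/odd pair
   (mm0 <-> mm1, mm2 <-> mm3, ...), i.e. the register number with bit 0 flipped. */
unsigned emmi_implied_reg(unsigned reg) { return (reg & 7u) ^ 1u; }

/* Signed 16-bit saturation, as used by PADDSIW and PSUBSIW. */
int16_t saturate_s16(int32_t v) {
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* One word lane of PADDSIW: imp <- saturate_s16(arg1 + arg2). */
int16_t paddsiw_lane(int16_t arg1, int16_t arg2) {
    return saturate_s16((int32_t)arg1 + arg2);
}

/* One word lane of the EMMI PMULHRW: (arg1*arg2 + 0x4000) >> 15. */
int16_t pmulhrw_lane(int16_t arg1, int16_t arg2) {
    int32_t p = (int32_t)arg1 * arg2 + 0x4000;
    return (int16_t)(p >> 15);  /* arithmetic shift assumed for negative p */
}

/* PMVZB over all eight byte lanes: load from memory where the implied
   source lane is zero, otherwise keep the destination lane. */
void pmvzb_lanes(uint8_t arg1[8], const uint8_t imp[8], const uint8_t mem[8]) {
    for (int i = 0; i < 8; i++)
        if (imp[i] == 0) arg1[i] = mem[i];
}
```

Multiplying `0x4000` by `0x4000` in `pmulhrw_lane` gives `0x2000`, which corresponds to 0.5 × 0.5 = 0.25 if the words are read as Q15 fixed-point values.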

  Instructions specific to VIA Technologies CPUs 

All VIA C3 processors support the VIA AIS (Alternate Instruction Set). The x86 instructions present in these processors to support AIS are:
{,  class="wikitable"
! Instruction !! Opcode !! Description
, -
,  {{nowrap, JMPAI EAX}} , ,  0F 3F (VIA Technologies, VIA C3 Samuel 2 Processor Datasheet, version 1.10, January 2002; publicly available datasheet that lists the 0F 3F and {{nowrap, 8D 84 00 imm32}} AIS opcodes (without mnemonics) on page 60; archived from the original on 10 Apr 2004) , ,  Near Jump to address in EAX, and enter Alternate Instruction mode.
, -
,  rowspan=2 ,  {{nowrap, AI uop32}}
,  {{nowrap, 8D 84 00 imm32}} , ,  Alternate instruction wrapper opcode ("Samuel"/"Ezra" variants of C3; repurposes the instruction encoding for {{nowrap, LEA EAX,[EAX+EAX+disp32]}})

The 32-bit immediate is treated as a 32-bit instruction of the RISC-like Alternate Instruction Set. An instruction set reference is available (VIA, VIA C3 Processor Alternate Instruction Set Programming Reference, version 0.25, November 2002; accessed on Apr 26, 2023).
, -
,  62 80 imm32 (VIA, VIA C3 Processor Alternate Instruction Set Application Note, version 0.24, 2002, page 14; accessed on Apr 26, 2023) , ,  Alternate instruction wrapper opcode ("Nehemiah" variants of C3; repurposes the instruction encoding for {{nowrap, BOUND EAX,[EAX+disp32]}})

These instructions are not present in the VIA C7 or any later VIA processor.
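
The two AIS wrapper encodings can be recognized mechanically in a byte stream. The following Python sketch scans for the <code>8D 84 00</code> and <code>62 80</code> prefixes and extracts the wrapped 32-bit word. It assumes the immediate is stored little-endian, as other x86 immediates are, which the VIA documents cited here are not quoted as confirming; the names <code>WRAPPERS</code> and <code>find_ais_wrappers</code> are hypothetical and chosen only for illustration.

```python
import struct

# Wrapper prefixes as listed for the two C3 core families.
WRAPPERS = {
    bytes.fromhex("8D8400"): "Samuel/Ezra",
    bytes.fromhex("6280"): "Nehemiah",
}


def find_ais_wrappers(code: bytes):
    """Return (offset, variant, word) for each wrapped AIS instruction found.

    This is a naive linear scan: it does not track x86 instruction
    boundaries, so it can report false positives inside the operands of
    other instructions.
    """
    hits = []
    i = 0
    while i < len(code):
        step = 1
        for prefix, variant in WRAPPERS.items():
            end = i + len(prefix) + 4  # prefix plus the 32-bit immediate
            if code.startswith(prefix, i) and end <= len(code):
                # Assumption: imm32 is little-endian, like other x86 immediates.
                (word,) = struct.unpack_from("<I", code, i + len(prefix))
                hits.append((i, variant, word))
                step = end - i  # skip past the whole wrapped instruction
                break
        i += step
    return hits
```

A call such as <code>find_ais_wrappers(buffer)</code> returns an empty list when no complete wrapper is present, for example when the buffer ends before the 32-bit immediate does.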

  Instructions specific to Chips and Technologies CPUs 

The C&T F8680 PC/Chip is a system-on-a-chip featuring an 80186-compatible CPU core, with a few additional instructions to support the F8680-specific "SuperState R" supervisor/system-management feature. Some of the added instructions for "SuperState R" are:

{| class="wikitable"
! Instruction !! Opcode !! Description
|-
| LFEAT AX || FE F8 || Load datum into F8680 "CREG" configuration register (AH=register-index, AL=datum)
|-
| STFEAT AL,imm8 || {{nowrap|FE F0 ib}} || Read F8680 status register into AL (imm8=register-index)
|}


C&T also developed a 386-compatible processor known as the Super386. In addition to the basic Intel 386 instruction set, this processor supports a number of instructions for the Super386-specific {{nowrap|"SuperState V"}} system-management feature. The added instructions for {{nowrap|"SuperState V"}} are:<ref>Microprocessor Report, ''System Management Mode Explained'' (vol 6, no. 8, June 17, 1992). Includes a listing of the AMD/Cyrix SMM opcodes and the C&T Super386 "SuperState V" opcodes. Archived on 29 Jun 2022.</ref>

{| class="wikitable"
! Instruction !! Opcode !! Description
|-
| {{nowrap|SCALL r/m}} || {{nowrap|0F 18 /0}} || Call SMM interrupt handler
|-
| SRET || 0F 19 || Return from SMM interrupt handler
|-
| SRESUME || 0F 1A || Return from SMM with interrupts disabled for one instruction
|-
| SVECTOR || 0F 1B || Exit from SMM and issue a shutdown cycle
|-
| EPIC || 0F 1E || Load one of the six interrupt or I/O traps
|-
| RARF1 || 0F 3C || Read from bank 1 of the register file (includes visible and invisible CPU registers)
|-
| RARF2 || 0F 3D || Read from bank 2 of the register file
|-
| RARF3 || 0F 3E || Read from bank 3 of the register file
|-
| LTLB || 0F F0 || Load TLB with page table entry
|-
| RCT || 0F F1 || Read cache tag
|-
| WCT || 0F F2 || Write cache tag
|-
| RCD || 0F F3 || Read cache data
|-
| WCD || 0F F4 || Write cache data
|-
| RTLBPA || 0F F5 || Read TLB data (physical address)
|-
| RTLBLA || 0F F6 || Read TLB tag (linear address)
|-
| LCFG || 0F F7 || Load configuration register
|-
| SCFG || 0F F8 || Store configuration register
|-
| RGPR || 0F F9 || Read general-purpose register or any bank of register file
|-
| RARF0 || 0F FA || Read from bank 0 of the register file
|-
| RARFE || 0F FB || Read from extra bank of the register file
|-
| WGPR || 0F FD || Write general-purpose register or any bank of register file
|-
| WARFE || 0F FE || Write extra bank of the register file
|}
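
Because every "SuperState V" instruction listed here is a two-byte <code>0F xx</code> opcode, the list maps naturally onto a lookup keyed by the second opcode byte. The Python sketch below shows one way a disassembler might do this. The names <code>SUPERSTATE_V</code> and <code>superstate_v_mnemonic</code> are hypothetical, operand decoding is omitted, and the only extra check is the <code>/0</code> requirement on SCALL (the reg field of the ModR/M byte must be 0).

```python
# Second opcode byte (after 0F) -> mnemonic, transcribed from the list above.
SUPERSTATE_V = {
    0x18: "SCALL", 0x19: "SRET", 0x1A: "SRESUME", 0x1B: "SVECTOR",
    0x1E: "EPIC",
    0x3C: "RARF1", 0x3D: "RARF2", 0x3E: "RARF3",
    0xF0: "LTLB", 0xF1: "RCT", 0xF2: "WCT", 0xF3: "RCD", 0xF4: "WCD",
    0xF5: "RTLBPA", 0xF6: "RTLBLA", 0xF7: "LCFG", 0xF8: "SCFG",
    0xF9: "RGPR", 0xFA: "RARF0", 0xFB: "RARFE",
    0xFD: "WGPR", 0xFE: "WARFE",
}


def superstate_v_mnemonic(code: bytes):
    """Return the mnemonic for the instruction at the start of `code`, or None."""
    if len(code) < 2 or code[0] != 0x0F:
        return None
    name = SUPERSTATE_V.get(code[1])
    if name == "SCALL":
        # 0F 18 /0: bits 5..3 (the reg field) of the ModR/M byte must be 0.
        if len(code) < 3 or (code[2] >> 3) & 0b111 != 0:
            return None
    return name
```

The lookup only identifies the listed encodings; it makes no claim about what the same byte sequences mean on processors other than the Super386.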


  Instructions specific to ALi/Nvidia/DM&P M6117 MCUs 

The M6117 series of embedded microcontrollers features an Intel 386SX-compatible CPU core derived from the V.M. Technology (VMT) VM386SX+ processor. The VM386SX+ adds a few processor-specific instructions to the Intel 386 instruction set. The ones documented for the DM&P M6117D are:
{| class="wikitable"
! Instruction !! Opcode !! Description
|-
| BRKPM || F1 || System management interrupt; enters "hyper state mode"
|-
| RETPM || D6 E6 || Return from "hyper state mode"
|-
| {{nowrap|LDUSR UGRS,EAX}} || {{nowrap|D6 CA 03 A0}} || Set page address of SMI entry point
|-
| (mnemonic not listed) || {{nowrap|D6 C8 03 A0}} || Read page address of SMI entry point
|-
| MOV PWRCR,EAX || D6 FA 03 02 || Write to power control register
|}


  Instructions present in specific 80387 clones 

Several 80387-class floating-point coprocessors provided extra instructions in addition to the standard 80387 ones. None of these are supported in later processors:
{| class="wikitable"
! Instruction
! Opcode
! Description
! Available on
|-
| FRSTPM
| {{nowrap|DB F4}} or {{nowrap|DB E5}}
| FPU Reset Protected Mode.
Instruction to signal to the FPU that the main CPU is exiting protected mode, similar to how the FSETPM instruction is used to signal to the FPU that the CPU is entering protected mode.

Different sources provide different encodings for this instruction.
| Intel 287XL
|-
| {{nowrap|FNSTDW AX}}
| DF E1
| Store FPU Device Word to AX
| rowspan="2" | Intel 387SL
|-
| {{nowrap|FNSTSG AX}}
| DF E2
| Store FPU Signature Register to AX{{efn|The FNSTSG AX instruction can be executed not just on the Intel 387SL FPU but on the Intel 387SX as well. Executing the instruction immediately after an FNINIT will cause it to return 0000h on the 387SX, but a nonzero signature value on the 387SL.<ref>Desmond Yuen, ''Intel's SL Architecture: Designing Portable Applications'' (1993, ISBN 0-07-911336-2), p. 127.</ref>}}
|-
| FSBP0
| DB E8
| Select Coprocessor Register Bank 0
| rowspan="5" | IIT 2c87, 3c87<ref>''IIT 3c87 Advanced Math CoProcessor Data Book''</ref>
|-
| FSBP1
| DB EB
| Select Coprocessor Register Bank 1
|-
| FSBP2
| DB EA
| Select Coprocessor Register Bank 2
|-
| FSBP3
| {{nowrap|DB E9}}<ref>Harald Feldmann</ref>
| Select Coprocessor Register Bank 3 (undocumented)
|-
| F4X4,
FMUL4X4
| DB F1
| Multiply 4-component vector with 4x4 matrix. For proper operation, the matrix must be preloaded into Coprocessor Register banks 1 and 2 (unique to IIT FPUs), and the vector must be loaded into Coprocessor Register Bank 0. Example code is available.<ref>Norbert Juffa, "Everything You Always Wanted To Know About Math Coprocessors", 01-oct-94 revision.</ref>
|-
| FTSTP
| D9 E6
| Equivalent to FTST followed by a stack pop.
| rowspan="4" | Cyrix EMC87, 83s87, 83d87, 387+<ref>Robert L. Hummel, ''PC Magazine Programmer's Technical Reference'', 1992, {{ISBN|1-56276-016-5}}, pages 670-672 and 710.</ref>
|-
| FRINT2
| DB FC
| Round st(0) to integer, with round-to-nearest ties-away-from-zero rounding.
|-
| FRICHOP
| DD FC
| Round st(0) to integer, with round-to-zero rounding.
|-
| FRINEAR
| DF FC
| Round st(0) to integer, with round-to-nearest-even rounding.
|}

{{notelist}}
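
The three Cyrix rounding instructions differ only in the rounding rule they apply to st(0). The Python sketch below illustrates the three rules using the standard <code>decimal</code> module. It models the arithmetic rule only, not x87 extended-precision behavior, flags, or exceptions; the mapping table <code>MODES</code> and the helper <code>round_like</code> are hypothetical names introduced for this illustration.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_EVEN, ROUND_HALF_UP

# Rounding rule described for each instruction, expressed as a decimal mode.
MODES = {
    "FRINT2": ROUND_HALF_UP,     # nearest, ties away from zero
    "FRICHOP": ROUND_DOWN,       # toward zero (truncate)
    "FRINEAR": ROUND_HALF_EVEN,  # nearest, ties to even
}


def round_like(mnemonic: str, value: str) -> int:
    """Round a decimal string to an integer using the named instruction's rule."""
    return int(Decimal(value).to_integral_value(rounding=MODES[mnemonic]))
```

The rules diverge only on ties and on truncation: for an input of 2.5, the FRINT2 rule gives 3 while the FRICHOP and FRINEAR rules both give 2; for 3.5, FRINT2 and FRINEAR both give 4 while FRICHOP gives 3.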

  See also 

* x86 instruction listings


  References 

{{reflist}}

{{x86 assembly topics}}

  
