X86 assembly language
   HOME

TheInfoList



OR:

x86 assembly language is the name for the family of
assembly language In computer programming, assembly language (or assembler language, or symbolic machine code), often referred to simply as Assembly and commonly abbreviated as ASM or asm, is any low-level programming language with a very strong correspondence b ...
s which provide some level of
backward compatibility Backward compatibility (sometimes known as backwards compatibility) is a property of an operating system, product, or technology that allows for interoperability with an older legacy system, or with input designed for such a system, especiall ...
with CPUs back to the
Intel 8008 The Intel 8008 ("''eight-thousand-eight''" or "''eighty-oh-eight''") is an early byte-oriented microprocessor designed by Computer Terminal Corporation (CTC), implemented and manufactured by Intel, and introduced in April 1972. It is an 8-bit CP ...
microprocessor, which was launched in April 1972. It is used to produce
object code In computing, object code or object module is the product of a compiler In computing, a compiler is a computer program that translates computer code written in one programming language (the ''source'' language) into another language (the ...
for the x86 class of processors. Regarded as a
programming language A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language. The description of a programming ...
, assembly is ''machine-specific'' and '' low-level''. Like all assembly languages, x86 assembly uses
mnemonic A mnemonic ( ) device, or memory device, is any learning technique that aids information retention or retrieval (remembering) in the human memory for better understanding. Mnemonics make use of elaborative encoding, retrieval cues, and image ...
s to represent fundamental CPU instructions, or
machine code In computer programming, machine code is any low-level programming language, consisting of machine language instructions, which are used to control a computer's central processing unit (CPU). Each instruction causes the CPU to perform a ve ...
. Assembly languages are most often used for detailed and time-critical applications such as small
real-time Real-time or real time describes various operations in computing or other processes that must guarantee response times within a specified time (deadline), usually a relatively short time. A real-time process is generally one that happens in defined ...
embedded system An embedded system is a computer system—a combination of a computer processor, computer memory, and input/output peripheral devices—that has a dedicated function within a larger mechanical or electronic system. It is ''embedded ...
s, operating-system kernels, and
device driver In computing, a device driver is a computer program that operates or controls a particular type of device that is attached to a computer or automaton. A driver provides a software interface to hardware devices, enabling operating systems and o ...
s, but can also be used for other applications, such as the game '' Roller Coaster Tycoon'', 99% of which was written in x86 assembly. A
compiler In computing, a compiler is a computer program that translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primarily used for programs tha ...
will sometimes produce assembly code as an intermediate step when translating a high-level program into machine code.


Keywords


Mnemonics and opcodes

Each x86 assembly instruction is represented by a
mnemonic A mnemonic ( ) device, or memory device, is any learning technique that aids information retention or retrieval (remembering) in the human memory for better understanding. Mnemonics make use of elaborative encoding, retrieval cues, and image ...
which, often combined with one or more operands, translates to one or more bytes called an
opcode In computing, an opcode (abbreviated from operation code, also known as instruction machine code, instruction code, instruction syllable, instruction parcel or opstring) is the portion of a machine language instruction that specifies the operat ...
; the NOP instruction translates to 0x90, for instance, and the HLT instruction translates to 0xF4. There are potential
opcode In computing, an opcode (abbreviated from operation code, also known as instruction machine code, instruction code, instruction syllable, instruction parcel or opstring) is the portion of a machine language instruction that specifies the operat ...
s with no documented mnemonic which different processors may interpret differently, making a program using them behave inconsistently or even generate an exception on some processors. These opcodes often turn up in code writing competitions as a way to make the code smaller, faster, more elegant or just show off the author's prowess.


Syntax

x86 assembly language has two main
syntax In linguistics, syntax () is the study of how words and morphemes combine to form larger units such as phrases and sentences. Central concerns of syntax include word order, grammatical relations, hierarchical sentence structure ( constituenc ...
branches: ''
Intel Intel Corporation is an American multinational corporation and technology company headquartered in Santa Clara, California. It is the world's largest semiconductor chip manufacturer by revenue, and is one of the developers of the x86 ser ...
syntax'' and '' AT&T syntax''. ''Intel syntax'' is dominant in the
DOS DOS is shorthand for the MS-DOS and IBM PC DOS family of operating systems. DOS may also refer to: Computing * Data over signalling (DoS), multiplexing data onto a signalling channel * Denial-of-service attack (DoS), an attack on a communicat ...
and
Windows Windows is a group of several proprietary graphical operating system families developed and marketed by Microsoft. Each family caters to a certain sector of the computing industry. For example, Windows NT for consumers, Windows Server for se ...
world, and AT&T syntax is dominant in the
Unix Unix (; trademarked as UNIX) is a family of multitasking, multiuser computer operating systems that derive from the original AT&T Unix, whose development started in 1969 at the Bell Labs research center by Ken Thompson, Dennis Ritchie, ...
world, since Unix was created at
AT&T Bell Labs Nokia Bell Labs, originally named Bell Telephone Laboratories (1925–1984), then AT&T Bell Laboratories (1984–1996) and Bell Labs Innovations (1996–2007), is an American industrial research and scientific development company owned by mult ...
. Here is a summary of the main differences between ''Intel syntax'' and ''AT&T syntax'': Many x86 assemblers use ''Intel syntax'', including FASM, MASM, NASM, TASM, and YASM. GAS, which originally used ''AT&T syntax'', has supported both syntaxes since version 2.10 via the ''.intel_syntax'' directive. A quirk in the AT&T syntax for x86 is that
x87 x87 is a floating-point-related subset of the x86 architecture instruction set. It originated as an extension of the 8086 instruction set in the form of optional floating-point coprocessors that worked in tandem with corresponding x86 CPUs. These ...
operands are reversed, an inherited bug from the original AT&T assembler. The AT&T syntax is nearly universal to all other architectures with the same order; it was originally a syntax for PDP-11 assembly. The Intel syntax is specific to the
x86 architecture x86 (also known as 80x86 or the 8086 family) is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel based on the Intel 8086 microprocessor and its 8088 variant. The 8086 was int ...
, and is the one used in the x86 platform's documentation.


Registers

x86 processors have a collection of registers available to be used as stores for binary data. Collectively the data and address registers are called the general registers. Each register has a special purpose in addition to what they can all do: * AX multiply/divide, string load & store * BX index register for MOVE * CX count for string operations & shifts * DX
port A port is a maritime facility comprising one or more wharves or loading areas, where ships load and discharge cargo and passengers. Although usually situated on a sea coast or estuary, ports can also be found far inland, such as H ...
address for IN and OUT * SP points to top of
the stack The Stack is a colloquialism used to describe the symmetrical, four-level stack interchange in Phoenix, Arizona that facilitates movements between Interstate 17/U.S. Route 60 and Interstate 10. Description In 2006, the Stack interchange saw an ...
* BP points to base of the stack frame * SI points to a source in stream operations * DI points to a destination in stream operations Along with the general registers there are additionally the: * IP instruction pointer * FLAGS * segment registers (CS, DS, ES, FS, GS, SS) which determine where a 64k segment starts (no FS & GS in
80286 The Intel 80286 (also marketed as the iAPX 286 and often called Intel 286) is a 16-bit microprocessor that was introduced on February 1, 1982. It was the first 8086-based CPU with separate, non- multiplexed address and data buses and also the ...
& earlier) * extra extension registers (
MMX MMX may refer to: * 2010, in Roman numerals Science and technology * MMX (instruction set), a single-instruction, multiple-data instruction set designed by Intel * MMX Mineração, a Brazilian mining company * Martian Moons eXploration, a Japane ...
,
3DNow! 3DNow! is a deprecated extension to the x86 instruction set developed by Advanced Micro Devices (AMD). It adds single instruction multiple data (SIMD) instructions to the base x86 instruction set, enabling it to perform vector processing of fl ...
, SSE, etc.) (Pentium & later only). The IP register points to the memory offset of the next instruction in the code segment (it points to the first byte of the instruction). The IP register cannot be accessed by the programmer directly. The x86 registers can be used by using the MOV instructions. For example, in Intel syntax: mov ax, 1234h ; copies the value 1234hex (4660d) into register AX mov bx, ax ; copies the value of the AX register into the BX register


Segmented addressing

The
x86 architecture x86 (also known as 80x86 or the 8086 family) is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel based on the Intel 8086 microprocessor and its 8088 variant. The 8086 was int ...
in
real Real may refer to: Currencies * Brazilian real (R$) * Central American Republic real * Mexican real * Portuguese real * Spanish real * Spanish colonial real Music Albums * ''Real'' (L'Arc-en-Ciel album) (2000) * ''Real'' (Bright album) (2010) ...
and
virtual 8086 mode In the 80386 microprocessor and later, virtual 8086 mode (also called virtual real mode, V86-mode, or VM86) allows the execution of real mode applications that are incapable of running directly in protected mode while the processor is running a ...
uses a process known as segmentation to address memory, not the flat memory model used in many other environments. Segmentation involves composing a memory address from two parts, a ''segment'' and an ''offset''; the segment points to the beginning of a 64 KB (64×210) group of addresses and the offset determines how far from this beginning address the desired address is. In segmented addressing, two registers are required for a complete memory address. One to hold the segment, the other to hold the offset. In order to translate back into a flat address, the segment value is shifted four bits left (equivalent to multiplication by 24 or 16) then added to the offset to form the full address, which allows breaking the
64k barrier 16-bit microcomputers are microcomputers that use 16-bit microprocessors. A 16-bit register can store 216 different values. The range of integer values that can be stored in 16 bits depends on the integer representation used. With the two most ...
through clever choice of addresses, though it makes programming considerably more complex. In
real mode Real mode, also called real address mode, is an operating mode of all x86-compatible CPUs. The mode gets its name from the fact that addresses in real mode always correspond to real locations in memory. Real mode is characterized by a 20- bit ...
/protected only, for example, if DS contains the
hexadecimal In mathematics and computing, the hexadecimal (also base-16 or simply hex) numeral system is a positional numeral system that represents numbers using a radix (base) of 16. Unlike the decimal system representing numbers using 10 symbols, he ...
number 0xDEAD and DX contains the number 0xCAFE they would together point to the memory address 0xDEAD * 0x10 + 0xCAFE = 0xEB5CE. Therefore, the CPU can address up to 1,048,576 bytes (1 MB) in real mode. By combining ''segment'' and ''offset'' values we find a 20-bit address. The original IBM PC restricted programs to 640 KB but an
expanded memory In DOS memory management, expanded memory is a system of bank switching that provided additional memory to DOS programs beyond the limit of conventional memory (640 KiB). ''Expanded memory'' is an umbrella term for several incompatible tec ...
specification was used to implement a bank switching scheme that fell out of use when later operating systems, such as Windows, used the larger address ranges of newer processors and implemented their own virtual memory schemes. Protected mode, starting with the Intel 80286, was utilized by
OS/2 OS/2 (Operating System/2) is a series of computer operating systems, initially created by Microsoft and IBM under the leadership of IBM software designer Ed Iacobucci. As a result of a feud between the two companies over how to position OS/2 r ...
. Several shortcomings, such as the inability to access the BIOS and the inability to switch back to real mode without resetting the processor, prevented widespread usage. The 80286 was also still limited to addressing memory in 16-bit segments, meaning only 216 bytes (64
kilobyte The kilobyte is a multiple of the unit byte for digital information. The International System of Units (SI) defines the prefix '' kilo'' as 1000 (103); per this definition, one kilobyte is 1000 bytes.International Standard IEC 80000-13 Quant ...
s) could be accessed at a time. To access the extended functionality of the 80286, the operating system would set the processor into protected mode, enabling 24-bit addressing and thus 224 bytes of memory (16
megabyte The megabyte is a multiple of the unit byte for digital information. Its recommended unit symbol is MB. The unit prefix ''mega'' is a multiplier of (106) in the International System of Units (SI). Therefore, one megabyte is one million bytes o ...
s). In
protected mode In computing, protected mode, also called protected virtual address mode, is an operational mode of x86-compatible central processing units (CPUs). It allows system software to use features such as virtual memory, paging and safe multi-taskin ...
, the segment selector can be broken down into three parts: a 13-bit index, a ''Table Indicator'' bit that determines whether the entry is in the GDT or LDT and a 2-bit ''Requested Privilege Level''; see x86 memory segmentation. When referring to an address with a segment and an offset the notation of ''segment'':''offset'' is used, so in the above example the ''flat address'' 0xEB5CE can be written as 0xDEAD:0xCAFE or as a segment and offset register pair; DS:DX. There are some special combinations of segment registers and general registers that point to important addresses: *CS:IP (CS is ''Code Segment'', IP is ''Instruction Pointer'') points to the address where the processor will fetch the next byte of code. *SS:SP (SS is ''Stack Segment'', SP is ''Stack Pointer'') points to the address of the top of the stack, i.e. the most recently pushed byte. *DS:SI (DS is ''Data Segment'', SI is ''Source Index'') is often used to point to string data that is about to be copied to ES:DI. *ES:DI (ES is ''Extra Segment'', DI is ''Destination Index'') is typically used to point to the destination for a string copy, as mentioned above. The Intel
80386 The Intel 386, originally released as 80386 and later renamed i386, is a 32-bit microprocessor introduced in 1985. The first versions had 275,000 transistorsprotected mode In computing, protected mode, also called protected virtual address mode, is an operational mode of x86-compatible central processing units (CPUs). It allows system software to use features such as virtual memory, paging and safe multi-taskin ...
which debuted in the 80286 was extended to allow the 80386 to address up to 4 GB of memory, the all new virtual 8086 mode (''VM86'') made it possible to run one or more real mode programs in a protected environment which largely emulated real mode, though some programs were not compatible (typically as a result of memory addressing tricks or using unspecified op-codes). The 32-bit flat memory model of the
80386 The Intel 386, originally released as 80386 and later renamed i386, is a 32-bit microprocessor introduced in 1985. The first versions had 275,000 transistorsAMD released
x86-64 x86-64 (also known as x64, x86_64, AMD64, and Intel 64) is a 64-bit version of the x86 instruction set, first released in 1999. It introduced two new modes of operation, 64-bit mode and compatibility mode, along with a new 4-level paging ...
in 2003, as it helped drive large scale adoption of Windows 3.1 (which relied on protected mode) since Windows could now run many applications at once, including DOS applications, by using virtual memory and simple multitasking.


Execution modes

The x86 processors support five modes of operation for x86 code, Real Mode, Protected Mode, Long Mode, Virtual 86 Mode, and System Management Mode, in which some instructions are available and others are not. A 16-bit subset of instructions is available on the 16-bit x86 processors, which are the 8086, 8088, 80186, 80188, and 80286. These instructions are available in real mode on all x86 processors, and in 16-bit protected mode (
80286 The Intel 80286 (also marketed as the iAPX 286 and often called Intel 286) is a 16-bit microprocessor that was introduced on February 1, 1982. It was the first 8086-based CPU with separate, non- multiplexed address and data buses and also the ...
onwards), additional instructions relating to protected mode are available. On the
80386 The Intel 386, originally released as 80386 and later renamed i386, is a 32-bit microprocessor introduced in 1985. The first versions had 275,000 transistorsOpteron Opteron is AMD's x86 former server and workstation processor line, and was the first processor which supported the AMD64 instruction set architecture (known generically as x86-64 or AMD64). It was released on April 22, 2003, with the ''Sledg ...
onwards), 64-bit instructions, and more registers, are also available. The instruction set is similar in each mode but memory addressing and word size vary, requiring different programming strategies. The modes in which x86 code can be executed in are: *
Real mode Real mode, also called real address mode, is an operating mode of all x86-compatible CPUs. The mode gets its name from the fact that addresses in real mode always correspond to real locations in memory. Real mode is characterized by a 20- bit ...
(16-bit) ** 20-bit segmented memory address space (meaning that only 1 MB of memory can be addressed—actually, slightly more), direct software access to peripheral hardware, and no concept of
memory protection Memory protection is a way to control memory access rights on a computer, and is a part of most modern instruction set architectures and operating systems. The main purpose of memory protection is to prevent a process from accessing memory that h ...
or multitasking at the hardware level. Computers that use
BIOS In computing, BIOS (, ; Basic Input/Output System, also known as the System BIOS, ROM BIOS, BIOS ROM or PC BIOS) is firmware used to provide runtime services for operating systems and programs and to perform hardware initialization during the b ...
start up in this mode. *
Protected mode In computing, protected mode, also called protected virtual address mode, is an operational mode of x86-compatible central processing units (CPUs). It allows system software to use features such as virtual memory, paging and safe multi-taskin ...
(16-bit and 32-bit) ** Expands addressable physical memory to 16 MB and addressable
virtual memory In computing, virtual memory, or virtual storage is a memory management technique that provides an "idealized abstraction of the storage resources that are actually available on a given machine" which "creates the illusion to users of a very ...
to 1 GB. Provides privilege levels and protected memory, which prevents programs from corrupting one another. 16-bit protected mode (used during the end of the
DOS DOS is shorthand for the MS-DOS and IBM PC DOS family of operating systems. DOS may also refer to: Computing * Data over signalling (DoS), multiplexing data onto a signalling channel * Denial-of-service attack (DoS), an attack on a communicat ...
era) used a complex, multi-segmented memory model. 32-bit protected mode uses a simple, flat memory model. *
Long mode In the x86-64 computer architecture, long mode is the mode where a 64-bit operating system can access 64-bit instructions and registers. 64-bit programs are run in a sub-mode called 64-bit mode, while 32-bit programs and 16-bit protected mode ...
(64-bit) ** Mostly an extension of the 32-bit (protected mode) instruction set, but unlike the 16–to–32-bit transition, many instructions were dropped in the 64-bit mode. Pioneered by AMD. *
Virtual 8086 mode In the 80386 microprocessor and later, virtual 8086 mode (also called virtual real mode, V86-mode, or VM86) allows the execution of real mode applications that are incapable of running directly in protected mode while the processor is running a ...
(16-bit) ** A special hybrid operating mode that allows real mode programs and operating systems to run while under the control of a protected mode supervisor operating system * System Management Mode (16-bit) ** Handles system-wide functions like power management, system hardware control, and proprietary OEM designed code. It is intended for use only by system firmware. All normal execution, including the
operating system An operating system (OS) is system software that manages computer hardware, software resources, and provides common daemon (computing), services for computer programs. Time-sharing operating systems scheduler (computing), schedule tasks for ef ...
, is suspended. An alternate software system (which usually resides in the computer's
firmware In computing, firmware is a specific class of computer software that provides the low-level control for a device's specific hardware. Firmware, such as the BIOS of a personal computer, may contain basic functions of a device, and may provide h ...
, or a hardware-assisted debugger) is then executed with high privileges.


Switching modes

The processor runs in real mode immediately after power on, so an
operating system An operating system (OS) is system software that manages computer hardware, software resources, and provides common daemon (computing), services for computer programs. Time-sharing operating systems scheduler (computing), schedule tasks for ef ...
kernel Kernel may refer to: Computing * Kernel (operating system), the central component of most operating systems * Kernel (image processing), a matrix used for image convolution * Compute kernel, in GPGPU programming * Kernel method, in machine learn ...
, or other program, must explicitly switch to another mode if it wishes to run in anything but real mode. Switching modes is accomplished by modifying certain bits of the processor's
control register A control register is a processor register which changes or controls the general behavior of a CPU or other digital device. Common tasks performed by control registers include interrupt control, switching the addressing mode, paging control, ...
s after some preparation, and some additional setup may be required after the switch.


Examples

With a computer running legacy
BIOS In computing, BIOS (, ; Basic Input/Output System, also known as the System BIOS, ROM BIOS, BIOS ROM or PC BIOS) is firmware used to provide runtime services for operating systems and programs and to perform hardware initialization during the b ...
, the BIOS and the boot loader is running in
Real mode Real mode, also called real address mode, is an operating mode of all x86-compatible CPUs. The mode gets its name from the fact that addresses in real mode always correspond to real locations in memory. Real mode is characterized by a 20- bit ...
, then the 64-bit operating system kernel checks and switches the CPU into Long mode and then starts new
kernel-mode In computer science, hierarchical protection domains, often called protection rings, are mechanisms to protect data and functionality from faults (by improving fault tolerance) and malicious behavior (by providing computer security). Compu ...
threads running 64-bit code. With a computer running
UEFI UEFI (Unified Extensible Firmware Interface) is a set of specifications written by the UEFI Forum. They define the architecture of the platform firmware used for booting and its interface for interaction with the operating system. Examples ...
, the UEFI firmware (except CSM and legacy
Option ROM An Option ROM for the PC platform (i.e. the IBM PC and derived successor computer systems) is a piece of firmware that resides in ROM on an expansion card (or stored along with the main system BIOS), which gets executed to initialize the device an ...
), the UEFI boot loader and the UEFI operating system kernel is all running in Long mode.


Instruction types

In general, the features of the modern
x86 instruction set The x86 instruction set refers to the set of instructions that x86-compatible microprocessors support. The instructions are usually part of an executable program, often stored as a computer file and executed on the processor. The x86 instructi ...
are: * A compact encoding ** Variable length and alignment independent (encoded as
little endian In computing, endianness, also known as byte sex, is the order or sequence of bytes of a word of digital data in computer memory. Endianness is primarily expressed as big-endian (BE) or little-endian (LE). A big-endian system stores the most si ...
, as is all data in the x86 architecture) ** Mainly one-address and two-address instructions, that is to say, the first
operand In mathematics, an operand is the object of a mathematical operation, i.e., it is the object or quantity that is operated on. Example The following arithmetic expression shows an example of operators and operands: :3 + 6 = 9 In the above exam ...
is also the destination. ** Memory operands as both source and destination are supported (frequently used to read/write stack elements addressed using small immediate offsets). ** Both general and implicit
register Register or registration may refer to: Arts entertainment, and media Music * Register (music), the relative "height" or range of a note, melody, part, instrument, etc. * ''Register'', a 2017 album by Travis Miller * Registration (organ), th ...
usage; although all seven (counting ebp) general registers in 32-bit mode, and all fifteen (counting rbp) general registers in 64-bit mode, can be freely used as accumulators or for addressing, most of them are also ''implicitly'' used by certain (more or less) special instructions; affected registers must therefore be temporarily preserved (normally stacked), if active during such instruction sequences. * Produces conditional flags implicitly through most integer ALU instructions. * Supports various addressing modes including immediate, offset, and scaled index but not PC-relative, except jumps (introduced as an improvement in the
x86-64 x86-64 (also known as x64, x86_64, AMD64, and Intel 64) is a 64-bit version of the x86 instruction set, first released in 1999. It introduced two new modes of operation, 64-bit mode and compatibility mode, along with a new 4-level paging ...
architecture). * Includes
floating point In computing, floating-point arithmetic (FP) is arithmetic that represents real numbers approximately, using an integer with a fixed precision, called the significand, scaled by an integer exponent of a fixed base. For example, 12.345 can ...
to a stack of registers. * Contains special support for atomic read-modify-write instructions (xchg, cmpxchg/cmpxchg8b, xadd, and integer instructions which combine with the lock prefix) *
SIMD Single instruction, multiple data (SIMD) is a type of parallel processing in Flynn's taxonomy. SIMD can be internal (part of the hardware design) and it can be directly accessible through an instruction set architecture (ISA), but it shoul ...
instructions (instructions which perform parallel simultaneous single instructions on many operands encoded in adjacent cells of wider registers).


Stack instructions

The x86 architecture has hardware support for an execution stack mechanism. Instructions such as push, pop, call and ret are used with the properly set up stack to pass parameters, to allocate space for local data, and to save and restore call-return points. The ret ''size'' instruction is very useful for implementing space efficient (and fast)
calling convention In computer science, a calling convention is an implementation-level (low-level) scheme for how subroutines or functions receive parameters from their caller and how they return a result. When some code calls a function, design choices have bee ...
s where the callee is responsible for reclaiming stack space occupied by parameters. When setting up a stack frame to hold local data of a recursive procedure there are several choices; the high level enter instruction (introduced with the 80186) takes a ''procedure-nesting-depth'' argument as well as a ''local size'' argument, and ''may'' be faster than more explicit manipulation of the registers (such as push bp ; mov bp, sp ; sub sp, ''size''). Whether it is faster or slower depends on the particular x86-processor implementation as well as the calling convention used by the compiler, programmer or particular program code; most x86 code is intended to run on x86-processors from several manufacturers and on different technological generations of processors, which implies highly varying
microarchitecture In computer engineering, microarchitecture, also called computer organization and sometimes abbreviated as µarch or uarch, is the way a given instruction set architecture (ISA) is implemented in a particular processor. A given ISA may be imp ...
s and
microcode In processor design, microcode (μcode) is a technique that interposes a layer of computer organization between the central processing unit (CPU) hardware and the programmer-visible instruction set architecture of a computer. Microcode is a la ...
solutions as well as varying
gate A gate or gateway is a point of entry to or from a space enclosed by walls. The word derived from old Norse "gat" meaning road or path; But other terms include ''yett and port''. The concept originally referred to the gap or hole in the wall ...
- and
transistor upright=1.4, gate (G), body (B), source (S) and drain (D) terminals. The gate is separated from the body by an insulating layer (pink). A transistor is a semiconductor device used to Electronic amplifier, amplify or electronic switch, switch ...
-level design choices. The full range of addressing modes (including ''immediate'' and ''base+offset'') even for instructions such as push and pop, makes direct usage of the stack for
integer An integer is the number zero (), a positive natural number (, , , etc.) or a negative integer with a minus sign ( −1, −2, −3, etc.). The negative numbers are the additive inverses of the corresponding positive numbers. In the languag ...
,
floating point In computing, floating-point arithmetic (FP) is arithmetic that represents real numbers approximately, using an integer with a fixed precision, called the significand, scaled by an integer exponent of a fixed base. For example, 12.345 can ...
and address data simple, as well as keeping the ABI specifications and mechanisms relatively simple compared to some RISC architectures (require more explicit call stack details).


Integer ALU instructions

x86 assembly has the standard mathematical operations, add, sub, mul, with idiv; the
logical operator In logic, a logical connective (also called a logical operator, sentential connective, or sentential operator) is a logical constant. They can be used to connect logical formulas. For instance in the syntax of propositional logic, the binary ...
s and, or, xor, neg;
bitshift In computer programming, a bitwise operation operates on a bit string, a bit array or a binary numeral (considered as a bit string) at the level of its individual bits. It is a fast and simple action, basic to the higher-level arithmetic operat ...
arithmetic and logical, sal/sar, shl/shr; rotate with and without carry, rcl/rcr, rol/ror, a complement of BCD arithmetic instructions, aaa, aad, daa and others.


Floating-point instructions

x86 assembly language includes instructions for a stack-based floating-point unit (FPU). The FPU was an optional separate coprocessor for the 8086 through the 80386, it was an on-chip option for the 80486 series, and it is a standard feature in every Intel x86 CPU since the 80486, starting with the Pentium. The FPU instructions include addition, subtraction, negation, multiplication, division, remainder, square roots, integer truncation, fraction truncation, and scale by power of two. The operations also include conversion instructions, which can load or store a value from memory in any of the following formats: binary-coded decimal, 32-bit integer, 64-bit integer, 32-bit floating-point, 64-bit floating-point or 80-bit floating-point (upon loading, the value is converted to the currently used floating-point mode). x86 also includes a number of transcendental functions, including sine, cosine, tangent, arctangent, exponentiation with the base 2 and logarithms to bases 2, 10, or ''e''. The stack register to stack register format of the instructions is usually f''op'' st, st(''n'') or f''op'' st(''n''), st, where st is equivalent to st(0), and st(''n'') is one of the 8 stack registers (st(0), st(1), ..., st(7)). Like the integers, the first operand is both the first source operand and the destination operand. fsubr and fdivr should be singled out as first swapping the source operands before performing the subtraction or division. The addition, subtraction, multiplication, division, store and comparison instructions include instruction modes that pop the top of the stack after their operation is complete. So, for example, faddp st(1), st performs the calculation st(1) = st(1) + st(0), then removes st(0) from the top of stack, thus making what was the result in st(1) the top of the stack in st(0).


SIMD instructions

Modern x86 CPUs contain
SIMD Single instruction, multiple data (SIMD) is a type of parallel processing in Flynn's taxonomy. SIMD can be internal (part of the hardware design) and it can be directly accessible through an instruction set architecture (ISA), but it shoul ...
instructions, which largely perform the same operation in parallel on many values encoded in a wide SIMD register. Various instruction technologies support different operations on different register sets, but taken as complete whole (from
MMX MMX may refer to: * 2010, in Roman numerals Science and technology * MMX (instruction set), a single-instruction, multiple-data instruction set designed by Intel * MMX Mineração, a Brazilian mining company * Martian Moons eXploration, a Japane ...
to SSE4.2) they include general computations on integer or floating-point arithmetic (addition, subtraction, multiplication, shift, minimization, maximization, comparison, division or square root). So for example, paddw mm0, mm1 performs 4 parallel 16-bit (indicated by the w) integer adds (indicated by the padd) of mm0 values to mm1 and stores the result in mm0.
Streaming SIMD Extensions In computing, Streaming SIMD Extensions (SSE) is a single instruction, multiple data ( SIMD) instruction set extension to the x86 architecture, designed by Intel and introduced in 1999 in their Pentium III series of Central processing units (CPU ...
or SSE also includes a floating-point mode in which only the very first value of the registers is actually modified (expanded in
SSE2 SSE2 (Streaming SIMD Extensions 2) is one of the Intel SIMD (Single Instruction, Multiple Data) processor supplementary instruction sets first introduced by Intel with the initial version of the Pentium 4 in 2000. It extends the earlier SSE i ...
). Some other unusual instructions have been added including a sum of absolute differences (used for motion estimation in
video compression In information theory, data compression, source coding, or bit-rate reduction is the process of encoding information using fewer bits than the original representation. Any particular compression is either lossy or lossless. Lossless compressio ...
, such as is done in
MPEG The Moving Picture Experts Group (MPEG) is an alliance of working groups established jointly by ISO and IEC that sets standards for media coding, including compression coding of audio, video, graphics, and genomic data; and transmission and f ...
) and a 16-bit multiply accumulation instruction (useful for software-based alpha-blending and
digital filter In signal processing, a digital filter is a system that performs mathematical operations on a sampled, discrete-time signal to reduce or enhance certain aspects of that signal. This is in contrast to the other major type of electronic filter, t ...
ing). SSE (since
SSE3 SSE3, Streaming SIMD Extensions 3, also known by its Intel code name Prescott New Instructions (PNI), is the third iteration of the SSE instruction set for the IA-32 (x86) architecture. Intel introduced SSE3 in early 2004 with the Prescott revis ...
) and
3DNow! 3DNow! is a deprecated extension to the x86 instruction set developed by Advanced Micro Devices (AMD). It adds single instruction multiple data (SIMD) instructions to the base x86 instruction set, enabling it to perform vector processing of fl ...
extensions include addition and subtraction instructions for treating paired floating-point values like complex numbers. These instruction sets also include numerous fixed sub-word instructions for shuffling, inserting and extracting the values around within the registers. In addition there are instructions for moving data between the integer registers and XMM (used in SSE)/FPU (used in MMX) registers.


Memory instructions

The x86 processor also includes complex addressing modes for addressing memory with an immediate offset, a register, a register with an offset, a scaled register with or without an offset, and a register with an optional offset and another scaled register. So for example, one can encode mov eax, able + ebx + esi*4/code> as a single instruction which loads 32 bits of data from the address computed as (Table + ebx + esi * 4) offset from the ds selector, and stores it to the eax register. In general x86 processors can load and use memory matched to the size of any register it is operating on. (The SIMD instructions also include half-load instructions.) Most 2-operand x86 instructions, including integer ALU instructions, use a standard " addressing mode byte" often called the MOD-REG-R/M byte. Many 32-bit x86 instructions also have a SIB addressing mode byte that follows the MOD-REG-R/M byte. Stephen McCamant
"Manual and Automated Binary Reverse Engineering"
In principle, because the instruction opcode is separate from the addressing mode byte, those instructions are orthogonal because any of those opcodes can be mixed-and-matched with any addressing mode. However, the x86 instruction set is generally considered non-orthogonal because many other opcodes have some fixed addressing mode (they have no addressing mode byte), and every register is special. The x86 instruction set includes string load, store, move, scan and compare instructions (lods, stos, movs, scas and cmps) which perform each operation to a specified size (b for 8-bit byte, w for 16-bit word, d for 32-bit double word) then increments/decrements (depending on DF, direction flag) the implicit address register (si for lods, di for stos and scas, and both for movs and cmps). For the load, store and scan operations, the implicit target/source/comparison register is in the al, ax or eax register (depending on size). The implicit segment registers used are ds for si and es for di. The cx or ecx register is used as a decrementing counter, and the operation stops when the counter reaches zero or (for scans and comparisons) when inequality is detected. The stack is implemented with an implicitly decrementing (push) and incrementing (pop) stack pointer. In 16-bit mode, this implicit stack pointer is addressed as SS: P in 32-bit mode it is SS: SP and in 64-bit mode it is SP The stack pointer actually points to the last value that was stored, under the assumption that its size will match the operating mode of the processor (i.e., 16, 32, or 64 bits) to match the default width of the push/pop/call/ret instructions. Also included are the instructions enter and leave which reserve and remove data from the top of the stack while setting up a stack frame pointer in bp/ebp/rbp. However, direct setting, or addition and subtraction to the sp/esp/rsp register is also supported, so the enter/leave instructions are generally unnecessary. This code in the beginning of a function: push ebp ; save calling function's stack frame (ebp) mov ebp, esp ; make a new stack frame on top of our caller's stack sub esp, 4 ; allocate 4 bytes of stack space for this function's local variables ...is functionally equivalent to just: enter 4, 0 Other instructions for manipulating the stack include pushf/popf for storing and retrieving the (E)FLAGS register. The pusha/popa instructions will store and retrieve the entire integer register state to and from the stack. Values for a SIMD load or store are assumed to be packed in adjacent positions for the SIMD register and will align them in sequential little-endian order. Some SSE load and store instructions require 16-byte alignment to function properly. The SIMD instruction sets also include "prefetch" instructions which perform the load but do not target any register, used for cache loading. The SSE instruction sets also include non-temporal store instructions which will perform stores straight to memory without performing a cache allocate if the destination is not already cached (otherwise it will behave like a regular store.) Most generic integer and floating-point (but no SIMD) instructions can use one parameter as a complex address as the second source parameter. Integer instructions can also accept one memory parameter as a destination operand.


Program flow

The x86 assembly has an unconditional jump operation, jmp, which can take an immediate address, a register or an indirect address as a parameter (note that most RISC processors only support a link register or short immediate displacement for jumping). Also supported are several conditional jumps, including jz (jump on zero), jnz (jump on non-zero), jg (jump on greater than, signed), jl (jump on less than, signed), ja (jump on above/greater than, unsigned), jb (jump on below/less than, unsigned). These conditional operations are based on the state of specific bits in the (E)FLAGS register. Many arithmetic and logic operations set, clear or complement these flags depending on their result. The comparison cmp (compare) and test instructions set the flags as if they had performed a subtraction or a bitwise AND operation, respectively, without altering the values of the operands. There are also instructions such as clc (clear carry flag) and cmc (complement carry flag) which work on the flags directly. Floating point comparisons are performed via fcom or ficom instructions which eventually have to be converted to integer flags. Each jump operation has three different forms, depending on the size of the operand. A ''short'' jump uses an 8-bit signed operand, which is a relative offset from the current instruction. A ''near'' jump is similar to a short jump but uses a 16-bit signed operand (in real or protected mode) or a 32-bit signed operand (in 32-bit protected mode only). A ''far'' jump is one that uses the full segment base:offset value as an absolute address. There are also indirect and indexed forms of each of these. In addition to the simple jump operations, there are the call (call a subroutine) and ret (return from subroutine) instructions. Before transferring control to the subroutine, call pushes the segment offset address of the instruction following the call onto the stack; ret pops this value off the stack, and jumps to it, effectively returning the flow of control to that part of the program. In the case of a far call, the segment base is pushed following the offset; far ret pops the offset and then the segment base to return. There are also two similar instructions, int (
interrupt In digital computers, an interrupt (sometimes referred to as a trap) is a request for the processor to ''interrupt'' currently executing code (when permitted), so that the event can be processed in a timely manner. If the request is accepted, ...
), which saves the current (E)FLAGS register value on the stack, then performs a far call, except that instead of an address, it uses an ''interrupt vector'', an index into a table of interrupt handler addresses. Typically, the interrupt handler saves all other CPU registers it uses, unless they are used to return the result of an operation to the calling program (in software called interrupts). The matching return from interrupt instruction is iret, which restores the flags after returning. ''Soft Interrupts'' of the type described above are used by some operating systems for
system call In computing, a system call (commonly abbreviated to syscall) is the programmatic way in which a computer program requests a service from the operating system on which it is executed. This may include hardware-related services (for example, acc ...
s, and can also be used in debugging hard interrupt handlers. ''Hard interrupts'' are triggered by external hardware events, and must preserve all register values as the state of the currently executing program is unknown. In Protected Mode, interrupts may be set up by the OS to trigger a task switch, which will automatically save all registers of the active task.


Examples


"Hello world!" program for DOS in MASM style assembly

Using interrupt 21h for output – other samples use libc's printf to print to stdout. .model small .stack 100h .data msg db 'Hello world!$' .code start: mov ah, 09h ; Display the message lea dx, msg int 21h mov ax, 4C00h ; Terminate the executable int 21h end start


"Hello world!" program for Windows in MASM style assembly

; requires /coff switch on 6.15 and earlier versions .386 .model small,c .stack 1000h .data msg db "Hello world!",0 .code includelib libcmt.lib includelib libvcruntime.lib includelib libucrt.lib includelib legacy_stdio_definitions.lib extrn printf:near extrn exit:near public main main proc push offset msg call printf push 0 call exit main endp end


"Hello world!" program for Windows in NASM style assembly

; Image base = 0x00400000 %define RVA(x) (x-0x00400000) section .text push dword hello call dword rintfpush byte +0 call dword
xit XIT may refer to: *XIT (band), a Native American rock group * XIT, a name briefly used by the 1960s English pop group Consortium A consortium (plural: consortia) is an association of two or more individuals, companies, organizations or governm ...
ret section .data hello db "Hello world!" section .idata dd RVA(msvcrt_LookupTable) dd -1 dd 0 dd RVA(msvcrt_string) dd RVA(msvcrt_imports) times 5 dd 0 ; ends the descriptor table msvcrt_string dd "msvcrt.dll", 0 msvcrt_LookupTable: dd RVA(msvcrt_printf) dd RVA(msvcrt_exit) dd 0 msvcrt_imports: printf dd RVA(msvcrt_printf) exit dd RVA(msvcrt_exit) dd 0 msvcrt_printf: dw 1 dw "printf", 0 msvcrt_exit: dw 2 dw "exit", 0 dd 0


"Hello world!" program for Linux in NASM style assembly

; ; This program runs in 32-bit protected mode. ; build: nasm -f elf -F stabs name.asm ; link: ld -o name name.o ; ; In 64-bit long mode you can use 64-bit registers (e.g. rax instead of eax, rbx instead of ebx, etc.) ; Also change "-f elf " for "-f elf64" in build command. ; section .data ; section for initialized data str: db 'Hello world!', 0Ah ; message string with new-line char at the end (10 decimal) str_len: equ $ - str ; calcs length of string (bytes) by subtracting the str's start address ; from this address ($ symbol) section .text ; this is the code section global _start ; _start is the entry point and needs global scope to be 'seen' by the ; linker --equivalent to main() in C/C++ _start: ; definition of _start procedure begins here mov eax, 4 ; specify the sys_write function code (from OS vector table) mov ebx, 1 ; specify file descriptor stdout --in gnu/linux, everything's treated as a file, ; even hardware devices mov ecx, str ; move start _address_ of string message to ecx register mov edx, str_len ; move length of message (in bytes) int 80h ; interrupt kernel to perform the system call we just set up - ; in gnu/linux services are requested through the kernel mov eax, 1 ; specify sys_exit function code (from OS vector table) mov ebx, 0 ; specify return code for OS (zero tells OS everything went fine) int 80h ; interrupt kernel to perform system call (to exit)


"Hello world!" program for Linux in NASM style assembly using the

C standard library The C standard library or libc is the standard library for the C programming language, as specified in the ISO C standard. ISO/IEC (2018). '' ISO/IEC 9899:2018(E): Programming Languages - C §7'' Starting from the original ANSI C standard, it was ...

; ; This program runs in 32-bit protected mode. ; gcc links the standard-C library by default ; build: nasm -f elf -F stabs name.asm ; link: gcc -o name name.o ; ; In 64-bit long mode you can use 64-bit registers (e.g. rax instead of eax, rbx instead of ebx, etc..) ; Also change "-f elf " for "-f elf64" in build command. ; global main ;main must be defined as it being compiled against the C-Standard Library extern printf ;declares use of external symbol as printf is declared in a different object-module. ;Linker resolves this symbol later. segment .data ;section for initialized data string db 'Hello world!', 0Ah, 0h ;message string with new-line char (10 decimal) and the NULL terminator ;string now refers to the starting address at which 'Hello, World' is stored. segment .text main: push string ;push the address of first character of string onto stack. This will be argument to printf call printf ;calls printf add esp, 4 ;advances stack-pointer by 4 flushing out the pushed string argument ret ;return


"Hello world!" program for 64-bit mode Linux in NASM style assembly

; build: nasm -f elf64 -F dwarf hello.asm ; link: ld -o hello hello.o DEFAULT REL ; use RIP-relative addressing modes by default, so oo= el foo SECTION .rodata ; read-only data can go in the .rodata section on GNU/Linux, like .rdata on Windows Hello: db "Hello world!",10 ; 10 = `\n`. len_Hello: equ $-Hello ; get NASM to calculate the length as an assemble-time constant ;; write() takes a length so a 0-terminated C-style string isn't needed. It would be for puts SECTION .text global _start _start: mov eax, 1 ; __NR_write syscall number from Linux asm/unistd_64.h (x86_64) mov edi, 1 ; int fd = STDOUT_FILENO lea rsi, el Hello ; x86-64 uses RIP-relative LEA to put static addresses into regs mov rdx, len_Hello ; size_t count = len_Hello syscall ; write(1, Hello, len_Hello); call into the kernel to actually do the system call ;; return value in RAX. RCX and R11 are also overwritten by syscall mov eax, 60 ; __NR_exit call number (x86_64) xor edi, edi ; status = 0 (exit normally) syscall ; _exit(0) Running it under strace verifies that no extra system calls are made in the process. The printf version would make many more system calls to initialize libc and do
dynamic linking In computing, a dynamic linker is the part of an operating system that loads and links the shared libraries needed by an executable when it is executed (at " run time"), by copying the content of libraries from persistent storage to RAM, filling ...
. But this is a static executable because we linked using ld without -pie or any shared libraries; the only instructions that run in user-space are the ones you provide. $ strace ./hello > /dev/null # without a redirect, your program's stdout is mixed strace's logging on stderr. Which is normally fine execve("./hello", ./hello" 0x7ffc8b0b3570 /* 51 vars */) = 0 write(1, "Hello world!\n", 13) = 13 exit(0) = ? +++ exited with 0 +++


Using the flags register

Flags are heavily used for comparisons in the x86 architecture. When a comparison is made between two data, the CPU sets the relevant flag or flags. Following this, conditional jump instructions can be used to check the flags and branch to code that should run, e.g.: cmp eax, ebx jne do_something ; ... do_something: ; do something here Flags are also used in the x86 architecture to turn on and off certain features or execution modes. For example, to disable all maskable interrupts, you can use the instruction: cli The flags register can also be directly accessed. The low 8 bits of the flag register can be loaded into ah using the lahf instruction. The entire flags register can also be moved on and off the stack using the instructions pushf, popf, int (including into) and iret.


Using the instruction pointer register

The instruction pointer is called ip in 16-bit mode, eip in 32-bit mode, and rip in 64-bit mode. The instruction pointer register points to the memory address which the processor will next attempt to execute; it cannot be directly accessed in 16-bit or 32-bit mode, but a sequence like the following can be written to put the address of next_line into eax: call next_line next_line: pop eax This sequence of instructions generates
position-independent code In computing, position-independent code (PIC) or position-independent executable (PIE) is a body of machine code that, being placed somewhere in the primary memory, executes properly regardless of its absolute address. PIC is commonly used fo ...
because call takes an instruction-pointer-relative immediate operand describing the offset in bytes of the target instruction from the next instruction (in this case 0). Writing to the instruction pointer is simple — a jmp instruction sets the instruction pointer to the target address, so, for example, a sequence like the following will put the contents of eax into eip: jmp eax In 64-bit mode, instructions can reference data relative to the instruction pointer, so there is less need to copy the value of the instruction pointer to another register.


See also

*
Assembly language In computer programming, assembly language (or assembler language, or symbolic machine code), often referred to simply as Assembly and commonly abbreviated as ASM or asm, is any low-level programming language with a very strong correspondence b ...
* X86 instruction listings *
X86 architecture x86 (also known as 80x86 or the 8086 family) is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel based on the Intel 8086 microprocessor and its 8088 variant. The 8086 was int ...
*
CPU design Processor design is a subfield of computer engineering and electronics engineering (fabrication) that deals with creating a processor, a key component of computer hardware. The design process involves choosing an instruction set and a certain exe ...
* List of assemblers * Self-modifying code *
DOS DOS is shorthand for the MS-DOS and IBM PC DOS family of operating systems. DOS may also refer to: Computing * Data over signalling (DoS), multiplexing data onto a signalling channel * Denial-of-service attack (DoS), an attack on a communicat ...


References


Further reading


Manuals


Intel 64 and IA-32 Software Developer Manuals

AMD64 Architecture Programmer's Manual (Volume 1-5)


Books

* {{DEFAULTSORT:X86 Assembly Language Assembly languages X86 architecture