Keywords
Mnemonics and opcodes
Each x86 assembly instruction is represented by a mnemonic which, often combined with one or more operands, translates to one or more bytes called an opcode.
Syntax
x86 assembly language has two main syntax branches: Intel syntax and AT&T syntax. ... .intel_syntax directive. A quirk in the AT&T syntax for x86 is that ...
Registers
x86 processors have a collection of registers available to be used as stores for binary data. Collectively the data and address registers are called the general registers. Each register has a special purpose in addition to what they can all do:
* AX multiply/divide, string load & store
* BX index register for MOVE
* CX count for string operations & shifts
* DX port address for IN and OUT
* SP points to top of the stack
* BP points to base of the stack frame
* SI points to a source in stream operations
* DI points to a destination in stream operations
Along with the general registers there are additionally the:
* IP instruction pointer
* FLAGS
* segment registers (CS, DS, ES, FS, GS, SS) which determine where a 64k segment starts (no FS & GS in 80286 & earlier)
* extra extension registers (MMX, ...
Segmented addressing
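As a concrete illustration of the segment and offset arithmetic covered in this section, here is a minimal Python sketch (a model, not assembly). The function name `linear_address` is made up for illustration, and the sketch assumes 16-bit real-mode segment and offset values.

```python
def linear_address(segment: int, offset: int) -> int:
    """Real-mode translation: shift the 16-bit segment left by 4 bits
    (multiply by 16), then add the 16-bit offset."""
    return ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)


# Different segment:offset pairs can name the same byte of memory.
print(hex(linear_address(0x1234, 0x0010)))  # 0x12350
print(hex(linear_address(0x1235, 0x0000)))  # 0x12350

# The largest pair reaches slightly past the 1 MB mark.
print(hex(linear_address(0xFFFF, 0xFFFF)))  # 0x10ffef
```

Because the segment is only shifted by four bits, consecutive segment values start 16 bytes apart and overlap heavily, which is why many pairs alias the same address.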
The x86 architecture in real and virtual 8086 mode uses a process known as segmentation to address memory, not the flat memory model used in many other environments. Segmentation involves composing a memory address from two parts, a ''segment'' and an ''offset''; the segment points to the beginning of a 64 KB (64×2^10 bytes) group of addresses and the offset determines how far from this beginning address the desired address is. In segmented addressing, two registers are required for a complete memory address: one to hold the segment, the other to hold the offset. In order to translate back into a flat address, the segment value is shifted four bits left (equivalent to multiplication by 2^4 or 16) then added to the offset to form the full address, which allows breaking the ...
Execution modes
The x86 processors support five modes of operation for x86 code: Real Mode, Protected Mode, Long Mode, Virtual 8086 Mode, and System Management Mode, in which some instructions are available and others are not. A 16-bit subset of instructions is available on the 16-bit x86 processors, which are the 8086, 8088, 80186, 80188, and 80286. These instructions are available in real mode on all x86 processors; in 16-bit protected mode (80286 onwards), additional instructions relating to protected mode are available. On the ...
Switching modes
The processor runs in real mode immediately after power on, so an operating system kernel, or other program, must explicitly switch to another mode if it wishes to run in anything but real mode. Switching modes is accomplished by modifying certain bits of the processor's control registers after some preparation, and some additional setup may be required after the switch.
Examples
With a computer running legacy ...
Instruction types
In general, the features of the modern x86 instruction set are:
* A compact encoding
** Variable length and alignment independent (encoded as little endian, as is all data in the x86 architecture)
** Mainly one-address and two-address instructions, that is to say, the first operand is also the destination.
** Memory operands as both source and destination are supported (frequently used to read/write stack elements addressed using small immediate offsets).
** Both general and implicit register usage; although all seven (counting ebp) general registers in 32-bit mode, and all fifteen (counting rbp) general registers in 64-bit mode, can be freely used as accumulators or for addressing, most of them are also ''implicitly'' used by certain (more or less) special instructions; affected registers must therefore be temporarily preserved (normally stacked), if active during such instruction sequences.
* Produces conditional flags implicitly through most integer ALU instructions.
* Supports various addressing modes including immediate, offset, and scaled index but not PC-relative, except jumps (introduced as an improvement in the x86-64 architecture).
* Includes floating-point instructions that operate on a stack of registers.
* Contains special support for atomic read-modify-write instructions (xchg, cmpxchg/cmpxchg8b, xadd, and integer instructions which combine with the lock prefix)
* SIMD instructions (instructions which perform the same operation in parallel on many operands encoded in adjacent cells of wider registers).
Stack instructions
The x86 architecture has hardware support for an execution stack mechanism. Instructions such as push, pop, call and ret are used with the properly set up stack to pass parameters, to allocate space for local data, and to save and restore call-return points. The ret ''size'' instruction is very useful for implementing space efficient (and fast) calling conventions where the callee is responsible for reclaiming stack space occupied by parameters.
When setting up a stack frame to hold local data of a recursive procedure there are several choices; the high level enter instruction (introduced with the 80186) takes a ''procedure-nesting-depth'' argument as well as a ''local size'' argument, and ''may'' be faster than more explicit manipulation of the registers (such as push bp; mov bp, sp; sub sp, ''size''). Whether it is faster or slower depends on the particular x86-processor implementation as well as the calling convention used by the compiler, programmer or particular program code; most x86 code is intended to run on x86-processors from several manufacturers and on different technological generations of processors, which implies highly varying performance.
... push and pop, makes direct usage of the stack for integer, floating point and address data simple, as well as keeping the ABI specifications and mechanisms relatively simple compared to some RISC architectures (which require more explicit call stack details).
Integer ALU instructions
x86 assembly has the standard mathematical operations, add, sub, mul and idiv; the logical operators and, or and xor, plus two's-complement negation with neg; arithmetic and logical bit shifts, sal/sar and shl/shr; rotates with and without carry, rcl/rcr and rol/ror; and a complement of BCD arithmetic instructions, aaa, aad, daa and others.
Floating-point instructions
x86 assembly language includes instructions for a stack-based floating-point unit (FPU). The FPU was an optional separate coprocessor for the 8086 through the 80386, it was an on-chip option for the 80486 series, and it is a standard feature in every Intel x86 CPU since the 80486, starting with the Pentium. The FPU instructions include addition, subtraction, negation, multiplication, division, remainder, square roots, integer truncation, fraction truncation, and scale by power of two. The operations also include conversion instructions, which can load or store a value from memory in any of the following formats: binary-coded decimal, 32-bit integer, 64-bit integer, 32-bit floating-point, 64-bit floating-point or 80-bit floating-point (upon loading, the value is converted to the currently used floating-point mode). x86 also includes a number of transcendental functions, including sine, cosine, tangent, arctangent, exponentiation with the base 2 and logarithms to bases 2, 10, or ''e''. The stack register to stack register format of the instructions is usually f''op'' st, st(''n'') or f''op'' st(''n''), st, where st is equivalent to st(0), and st(''n'') is one of the 8 stack registers (st(0), st(1), ..., st(7)). As with the integer instructions, the first operand is both the first source operand and the destination operand. fsubr and fdivr should be singled out as first swapping the source operands before performing the subtraction or division. The addition, subtraction, multiplication, division, store and comparison instructions include instruction modes that pop the top of the stack after their operation is complete. So, for example, faddp st(1), st performs the calculation st(1) = st(1) + st(0), then removes st(0) from the top of stack, thus making what was the result in st(1) the top of the stack in st(0).
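The pop-after-operate behaviour of faddp can be sketched with a small Python model (a simplification, not real FPU code). The list `regs` and the function name `faddp` are illustrative assumptions: index 0 of the list stands for st(0), index 1 for st(1), and so on, and the model only handles the case where the destination is not st(0).

```python
def faddp(stack, n):
    """Model `faddp st(n), st`: st(n) = st(n) + st(0), then pop st(0).

    stack[0] stands for st(0), stack[1] for st(1), and so on.
    """
    if not 1 <= n < len(stack):
        raise IndexError("st(n) must name an occupied register other than st(0)")
    stack[n] = stack[n] + stack[0]
    stack.pop(0)  # popping renumbers the registers: old st(1) becomes st(0)
    return stack


regs = [1.5, 2.0, 10.0]  # st(0)=1.5, st(1)=2.0, st(2)=10.0
faddp(regs, 1)           # st(1) = 2.0 + 1.5, then pop
print(regs)              # [3.5, 10.0]
```

After the call, the sum sits at index 0, matching the description that the result ends up in st(0) once the old top of stack is removed.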
SIMD instructions
Modern x86 CPUs contain SIMD instructions, which largely perform the same operation in parallel on many values encoded in a wide SIMD register. Various instruction technologies support different operations on different register sets, but taken as a complete whole (from MMX to ...) ... paddw mm0, mm1 performs 4 parallel 16-bit (indicated by the w) integer adds (indicated by the padd) of mm0 values to mm1 and stores the result in mm0.
Memory instructions
The x86 processor also includes complex addressing modes for addressing memory with an immediate offset, a register, a register with an offset, a scaled register with or without an offset, and a register with an optional offset and another scaled register. So for example, one can encode mov eax, [Table + ebx + esi*4] as a single instruction which loads 32 bits of data from the address computed as (Table + ebx + esi * 4) offset from the ds selector, and stores it to the eax register. In general x86 processors can load and use memory matched to the size of any register they are operating on. (The SIMD instructions also include half-load instructions.)
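The address arithmetic behind an operand such as [Table + ebx + esi*4] can be sketched in Python. This is only a model of the offset calculation in 32-bit mode; the function name `effective_address` and the value chosen for `TABLE` are hypothetical, and the segment base that ds would contribute is left out.

```python
def effective_address(disp=0, base=0, index=0, scale=1):
    """Offset = displacement + base + index * scale, wrapped to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 scale factors are limited to 1, 2, 4 and 8")
    return (disp + base + index * scale) & 0xFFFFFFFF


TABLE = 0x00403000  # hypothetical address of a table of 32-bit entries
ebx = 0x10          # byte offset of a sub-table
esi = 3             # element index; each element is 4 bytes wide

print(hex(effective_address(disp=TABLE, base=ebx, index=esi, scale=4)))  # 0x40301c
```

The scale factor lets an element index be used directly for 1-, 2-, 4- or 8-byte elements without a separate shift or multiply instruction.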
Most 2-operand x86 instructions, including integer ALU instructions, use a standard "addressing mode byte" often called the MOD-REG-R/M byte. Many 32-bit x86 instructions also have a SIB addressing mode byte that follows the MOD-REG-R/M byte (Stephen McCamant, "Manual and Automated Binary Reverse Engineering"). In principle, because the instruction opcode is separate from the addressing mode byte, those instructions are orthogonal because any of those opcodes can be mixed-and-matched with any addressing mode. However, the x86 instruction set is generally considered non-orthogonal because many other opcodes have some fixed addressing mode (they have no addressing mode byte), and every register is special.
The x86 instruction set includes string load, store, move, scan and compare instructions (lods, stos, movs, scas and cmps) which perform each operation to a specified size (b for 8-bit byte, w for 16-bit word, d for 32-bit double word) then increment or decrement (depending on DF, the direction flag) the implicit address register (si for lods, di for stos and scas, and both for movs and cmps). For the load, store and scan operations, the implicit target/source/comparison register is al, ax or eax (depending on size). The implicit segment registers used are ds for si and es for di. The cx or ecx register is used as a decrementing counter, and the operation stops when the counter reaches zero or (for scans and comparisons) when inequality is detected.
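The interplay of the counter, the direction flag and the two implicit address registers can be sketched as a Python model of a repeated byte move (rep movsb). This is a simplification under stated assumptions: memory is a flat `bytearray`, the ds and es segments are ignored, and the function name `rep_movsb` is made up for illustration.

```python
def rep_movsb(memory, si, di, cx, df=0):
    """Model `rep movsb`: copy cx bytes from memory[si] to memory[di].

    df=0 walks upward through memory, df=1 walks downward.
    Returns the final (si, di, cx) register values.
    """
    step = -1 if df else 1
    while cx != 0:
        memory[di] = memory[si]  # movsb moves one byte
        si += step               # both address registers advance together
        di += step
        cx -= 1                  # the counter decrements once per element
    return si, di, cx


mem = bytearray(b"Hello world!" + bytes(12))
si, di, cx = rep_movsb(mem, si=0, di=12, cx=5)
print(mem[12:17])  # bytearray(b'Hello')
print(si, di, cx)  # 5 17 0
```

The scan and compare variants would add an early exit on the comparison result, which this sketch leaves out.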
The stack is implemented with an implicitly decrementing (push) and incrementing (pop) stack pointer. In 16-bit mode, this implicit stack pointer is addressed as SS:[SP], in 32-bit mode it is SS:[ESP], and in 64-bit mode it is [RSP]. The stack pointer actually points to the last value that was stored, under the assumption that its size will match the operating mode of the processor (i.e., 16, 32, or 64 bits) to match the default width of the push/pop/call/ret instructions. Also included are the instructions enter and leave which reserve and remove data from the top of the stack while setting up a stack frame pointer in bp/ebp/rbp. However, direct setting, or addition and subtraction to the sp/esp/rsp register is also supported, so the enter/leave instructions are generally unnecessary.
This code in the beginning of a function:
push ebp ; save calling function's stack frame (ebp)
mov ebp, esp ; make a new stack frame on top of our caller's stack
sub esp, 4 ; allocate 4 bytes of stack space for this function's local variables
...is functionally equivalent to just:
enter 4, 0
Other instructions for manipulating the stack include pushf/popf for storing and retrieving the (E)FLAGS register. The pusha/popa instructions will store and retrieve the entire integer register state to and from the stack.
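The decrementing push, incrementing pop and the frame setup shown earlier can be modelled in a few lines of Python. This is a sketch of 32-bit behaviour only; the class name `Stack32`, the 64-byte stack size and the caller's ebp value are assumptions chosen for illustration.

```python
class Stack32:
    """Toy model of a 32-bit stack that grows toward lower addresses."""

    def __init__(self, size=64):
        self.mem = bytearray(size)
        self.esp = size  # empty stack: esp sits just past the highest address

    def push(self, value):
        self.esp -= 4  # decrement first, then store
        self.mem[self.esp:self.esp + 4] = (value & 0xFFFFFFFF).to_bytes(4, "little")

    def pop(self):
        value = int.from_bytes(self.mem[self.esp:self.esp + 4], "little")
        self.esp += 4  # load first, then increment
        return value


s = Stack32()
ebp = 0x1000       # hypothetical frame pointer of the caller

s.push(ebp)        # push ebp
ebp = s.esp        # mov ebp, esp
s.esp -= 4         # sub esp, 4   (room for one 4-byte local)
print(s.esp, ebp)  # 56 60

s.esp = ebp        # leave: mov esp, ebp ...
ebp = s.pop()      # ... followed by pop ebp
print(hex(ebp), s.esp)  # 0x1000 64
```

The model shows why esp always points at the last value stored: push moves the pointer before writing, and pop reads before moving it back.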
Values for a SIMD load or store are assumed to be packed in adjacent positions for the SIMD register and will align them in sequential little-endian order. Some SSE load and store instructions require 16-byte alignment to function properly. The SIMD instruction sets also include "prefetch" instructions which perform the load but do not target any register, used for cache loading. The SSE instruction sets also include non-temporal store instructions which will perform stores straight to memory without performing a cache allocate if the destination is not already cached (otherwise it will behave like a regular store.)
Most generic integer and floating-point (but no SIMD) instructions can use one parameter as a complex address as the second source parameter. Integer instructions can also accept one memory parameter as a destination operand.
Program flow
The x86 assembly has an unconditional jump operation, jmp, which can take an immediate address, a register or an indirect address as a parameter (note that most RISC processors only support a link register or short immediate displacement for jumping).
Also supported are several conditional jumps, including jz (jump on zero), jnz (jump on non-zero), jg (jump on greater than, signed), jl (jump on less than, signed), ja (jump on above/greater than, unsigned), jb (jump on below/less than, unsigned). These conditional operations are based on the state of specific bits in the (E)FLAGS register. Many arithmetic and logic operations set, clear or complement these flags depending on their result. The cmp (compare) and test instructions set the flags as if they had performed a subtraction or a bitwise AND operation, respectively, without altering the values of the operands. There are also instructions such as clc (clear carry flag) and cmc (complement carry flag) which work on the flags directly. Floating point comparisons are performed via fcom or ficom instructions which eventually have to be converted to integer flags.
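How a cmp feeds the signed and unsigned conditional jumps can be sketched in Python. This is a model, not processor code: the helper names `cmp_flags` and `taken` are made up for illustration, only the four flags needed by the jumps listed above are computed, and operands are assumed to be 32 bits wide unless stated otherwise.

```python
def cmp_flags(a, b, bits=32):
    """Flags produced by `cmp a, b`, i.e. by computing a - b and discarding it."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "ZF": result == 0,                         # operands were equal
        "CF": a < b,                               # unsigned borrow
        "SF": bool(result & sign),                 # sign bit of the result
        "OF": bool((a ^ b) & (a ^ result) & sign)  # signed overflow
    }


def taken(jcc, f):
    """Whether the named conditional jump would be taken for flags f."""
    conditions = {
        "jz": f["ZF"],
        "jnz": not f["ZF"],
        "jg": not f["ZF"] and f["SF"] == f["OF"],
        "jl": f["SF"] != f["OF"],
        "ja": not f["CF"] and not f["ZF"],
        "jb": f["CF"],
    }
    return conditions[jcc]


# 0xFFFFFFFF is -1 when read as signed, but 4294967295 when read as unsigned.
f = cmp_flags(0xFFFFFFFF, 1)
print(taken("jl", f), taken("ja", f))  # True True
```

The same bit pattern is "less" for the signed jump and "above" for the unsigned one, which is why x86 provides both families of conditions.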
Each jump operation has three different forms, depending on the size of the operand. A ''short'' jump uses an 8-bit signed operand, which is a relative offset from the end of the current instruction (that is, from the address of the next instruction). A ''near'' jump is similar to a short jump but uses a 16-bit signed operand (in real or protected mode) or a 32-bit signed operand (in 32-bit protected mode only). A ''far'' jump is one that uses the full segment base:offset value as an absolute address. There are also indirect and indexed forms of each of these.
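The 8-bit operand of a short jump can be worked out with a small Python sketch. It assumes the usual two-byte short jump encoding (one opcode byte plus one displacement byte) and counts the displacement from the address of the following instruction; the function name `short_jump_rel8` and the addresses are illustrative only.

```python
def short_jump_rel8(jump_address, target, instruction_length=2):
    """Encode the displacement byte of a short jump located at jump_address."""
    disp = target - (jump_address + instruction_length)
    if not -128 <= disp <= 127:
        raise ValueError("target is out of range for a short jump")
    return disp & 0xFF  # two's-complement byte as stored in the instruction


# A jump to itself must step back over its own two bytes.
print(hex(short_jump_rel8(0x100, 0x100)))  # 0xfe

# A jump over the next 16 bytes.
print(hex(short_jump_rel8(0x100, 0x112)))  # 0x10
```

Targets outside the range of -128 to +127 bytes need one of the longer forms.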
In addition to the simple jump operations, there are the call (call a subroutine) and ret (return from subroutine) instructions. Before transferring control to the subroutine, call pushes the segment offset address of the instruction following the call onto the stack; ret pops this value off the stack, and jumps to it, effectively returning the flow of control to that part of the program. In the case of a far call, the segment base is pushed before the offset; far ret pops the offset and then the segment base to return.
There is also the similar int (interrupt) instruction, which saves the current (E)FLAGS register value on the stack, then performs a far call, except that instead of an address, it uses an ''interrupt vector'', an index into a table of interrupt handler addresses. Typically, the interrupt handler saves all other CPU registers it uses, unless they are used to return the result of an operation to the calling program (in software-called interrupts). The matching return from interrupt instruction is iret, which restores the flags after returning. ''Soft interrupts'' of the type described above are used by some operating systems for system calls, and can also be used in debugging hard interrupt handlers. ''Hard interrupts'' are triggered by external hardware events, and must preserve all register values as the state of the currently executing program is unknown. In Protected Mode, interrupts may be set up by the OS to trigger a task switch, which will automatically save all registers of the active task.
Examples
"Hello world!" program for DOS in MASM style assembly
Using interrupt 21h for output – other samples use libc's printf to print to stdout.
.model small
.stack 100h
.data
msg db 'Hello world!$'
.code
start:
mov ax, @data ; DS does not point at the data segment on entry to an .exe, so load it
mov ds, ax
mov ah, 09h ; Display the message
lea dx, msg
int 21h
mov ax, 4C00h ; Terminate the executable
int 21h
end start
"Hello world!" program for Windows in MASM style assembly
; requires /coff switch on 6.15 and earlier versions
.386
.model small,c
.stack 1000h
.data
msg db "Hello world!",0
.code
includelib libcmt.lib
includelib libvcruntime.lib
includelib libucrt.lib
includelib legacy_stdio_definitions.lib
extrn printf:near
extrn exit:near
public main
main proc
push offset msg
call printf
push 0
call exit
main endp
end
"Hello world!" program for Windows in NASM style assembly
; Image base = 0x00400000
%define RVA(x) (x-0x00400000)
section .text
push dword hello
call dword [printf]
push byte +0
call dword [exit]
ret
section .data
hello db "Hello world!", 0 ; printf expects a NUL-terminated string
section .idata
dd RVA(msvcrt_LookupTable)
dd -1
dd 0
dd RVA(msvcrt_string)
dd RVA(msvcrt_imports)
times 5 dd 0 ; ends the descriptor table
msvcrt_string dd "msvcrt.dll", 0
msvcrt_LookupTable:
dd RVA(msvcrt_printf)
dd RVA(msvcrt_exit)
dd 0
msvcrt_imports:
printf dd RVA(msvcrt_printf)
exit dd RVA(msvcrt_exit)
dd 0
msvcrt_printf:
dw 1
dw "printf", 0
msvcrt_exit:
dw 2
dw "exit", 0
dd 0
"Hello world!" program for Linux in NASM style assembly
;
; This program runs in 32-bit protected mode.
; build: nasm -f elf -F stabs name.asm
; link: ld -o name name.o
;
; In 64-bit long mode you can use 64-bit registers (e.g. rax instead of eax, rbx instead of ebx, etc.)
; Also change "-f elf " for "-f elf64" in build command.
;
section .data ; section for initialized data
str: db 'Hello world!', 0Ah ; message string with new-line char at the end (10 decimal)
str_len: equ $ - str ; calcs length of string (bytes) by subtracting the str's start address
; from this address ($ symbol)
section .text ; this is the code section
global _start ; _start is the entry point and needs global scope to be 'seen' by the
; linker --equivalent to main() in C/C++
_start: ; definition of _start procedure begins here
mov eax, 4 ; specify the sys_write function code (from OS vector table)
mov ebx, 1 ; specify file descriptor stdout --in gnu/linux, everything's treated as a file,
; even hardware devices
mov ecx, str ; move start _address_ of string message to ecx register
mov edx, str_len ; move length of message (in bytes)
int 80h ; interrupt kernel to perform the system call we just set up -
; in gnu/linux services are requested through the kernel
mov eax, 1 ; specify sys_exit function code (from OS vector table)
mov ebx, 0 ; specify return code for OS (zero tells OS everything went fine)
int 80h ; interrupt kernel to perform system call (to exit)
"Hello world!" program for Linux in NASM style assembly using the
C standard library
;
; This program runs in 32-bit protected mode.
; gcc links the standard-C library by default
; build: nasm -f elf -F stabs name.asm
; link: gcc -o name name.o
;
; In 64-bit long mode you can use 64-bit registers (e.g. rax instead of eax, rbx instead of ebx, etc..)
; Also change "-f elf " for "-f elf64" in build command.
;
global main ;main must be defined as it being compiled against the C-Standard Library
extern printf ;declares use of external symbol as printf is declared in a different object-module.
;Linker resolves this symbol later.
segment .data ;section for initialized data
string db 'Hello world!', 0Ah, 0h ;message string with new-line char (10 decimal) and the NULL terminator
;string now refers to the starting address at which 'Hello, World' is stored.
segment .text
main:
push string ;push the address of first character of string onto stack. This will be argument to printf
call printf ;calls printf
add esp, 4 ;advances stack-pointer by 4 flushing out the pushed string argument
ret ;return
"Hello world!" program for 64-bit mode Linux in NASM style assembly
; build: nasm -f elf64 -F dwarf hello.asm
; link: ld -o hello hello.o
DEFAULT REL ; use RIP-relative addressing modes by default, so [foo] = [rel foo]
SECTION .rodata ; read-only data can go in the .rodata section on GNU/Linux, like .rdata on Windows
Hello: db "Hello world!",10 ; 10 = `\n`.
len_Hello: equ $-Hello ; get NASM to calculate the length as an assemble-time constant
;; write() takes a length so a 0-terminated C-style string isn't needed. It would be for puts
SECTION .text
global _start
_start:
mov eax, 1 ; __NR_write syscall number from Linux asm/unistd_64.h (x86_64)
mov edi, 1 ; int fd = STDOUT_FILENO
lea rsi, [rel Hello] ; x86-64 uses RIP-relative LEA to put static addresses into regs
mov rdx, len_Hello ; size_t count = len_Hello
syscall ; write(1, Hello, len_Hello); call into the kernel to actually do the system call
;; return value in RAX. RCX and R11 are also overwritten by syscall
mov eax, 60 ; __NR_exit call number (x86_64)
xor edi, edi ; status = 0 (exit normally)
syscall ; _exit(0)
Running it under strace verifies that no extra system calls are made in the process. The printf version would make many more system calls to initialize libc and do dynamic linking. But this is a static executable because we linked using ld without -pie or any shared libraries; the only instructions that run in user-space are the ones you provide.
$ strace ./hello > /dev/null # without a redirect, your program's stdout is mixed with strace's logging on stderr. Which is normally fine
execve("./hello", ["./hello"], 0x7ffc8b0b3570 /* 51 vars */) = 0
write(1, "Hello world!\n", 13) = 13
exit(0) = ?
+++ exited with 0 +++
Using the flags register
Flags are heavily used for comparisons in the x86 architecture. When a comparison is made between two values, the CPU sets the relevant flag or flags. Following this, conditional jump instructions can be used to check the flags and branch to code that should run, e.g.:
cmp eax, ebx
jne do_something
; ...
do_something:
; do something here
Flags are also used in the x86 architecture to turn on and off certain features or execution modes. For example, to disable all maskable interrupts, you can use the instruction:
cli
The flags register can also be directly accessed. The low 8 bits of the flag register can be loaded into ah using the lahf instruction. The entire flags register can also be moved on and off the stack using the instructions pushf, popf, int (including into) and iret.
Using the instruction pointer register
The instruction pointer is called ip in 16-bit mode, eip in 32-bit mode, and rip in 64-bit mode. The instruction pointer register points to the memory address which the processor will next attempt to execute; it cannot be directly accessed in 16-bit or 32-bit mode, but a sequence like the following can be written to put the address of next_line into eax:
call next_line
next_line:
pop eax
This sequence of instructions generates position-independent code because call takes an instruction-pointer-relative immediate operand describing the offset in bytes of the target instruction from the next instruction (in this case 0).
Writing to the instruction pointer is simple: a jmp instruction sets the instruction pointer to the target address, so, for example, a sequence like the following will put the contents of eax into eip:
jmp eax
In 64-bit mode, instructions can reference data relative to the instruction pointer, so there is less need to copy the value of the instruction pointer to another register.
See also
* Assembly language
* X86 instruction listings
* X86 architecture
* CPU design
* List of assemblers
* Self-modifying code
* DOS
Further reading
Manuals
Intel 64 and IA-32 Software Developer Manuals
AMD64 Architecture Programmer's Manual (Volume 1-5)