HOME

TheInfoList



OR:

x86 assembly language is the name for the family of assembly languages which provide some level of
backward compatibility Backward compatibility (sometimes known as backwards compatibility) is a property of an operating system, product, or technology that allows for interoperability with an older legacy system, or with input designed for such a system, especially ...
with CPUs back to the
Intel 8008 The Intel 8008 ("''eight-thousand-eight''" or "''eighty-oh-eight''") is an early byte-oriented microprocessor designed by Computer Terminal Corporation (CTC), implemented and manufactured by Intel, and introduced in April 1972. It is an 8-bit CP ...
microprocessor, which was launched in April 1972. It is used to produce
object code In computing, object code or object module is the product of a compiler. In a general sense object code is a sequence of statements or instructions in a computer language, usually a machine code language (i.e., binary) or an intermediate lang ...
for the x86 class of processors. Regarded as a
programming language A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language. The description of a programming l ...
, assembly is ''machine-specific'' and '' low-level''. Like all assembly languages, x86 assembly uses
mnemonic A mnemonic ( ) device, or memory device, is any learning technique that aids information retention or retrieval (remembering) in the human memory for better understanding. Mnemonics make use of elaborative encoding, retrieval cues, and image ...
s to represent fundamental CPU instructions, or
machine code In computer programming, machine code is any low-level programming language, consisting of machine language instructions, which are used to control a computer's central processing unit (CPU). Each instruction causes the CPU to perform a ver ...
. Assembly languages are most often used for detailed and time-critical applications such as small
real-time Real-time or real time describes various operations in computing or other processes that must guarantee response times within a specified time (deadline), usually a relatively short time. A real-time process is generally one that happens in defined ...
embedded system An embedded system is a computer system—a combination of a computer processor, computer memory, and input/output peripheral devices—that has a dedicated function within a larger mechanical or electronic system. It is ''embedded'' ...
s, operating-system kernels, and
device driver In computing, a device driver is a computer program that operates or controls a particular type of device that is attached to a computer or automaton. A driver provides a software interface to hardware devices, enabling operating systems and o ...
s, but can also be used for other applications, such as the game ''
Roller Coaster Tycoon Roller may refer to: Birds *Roller, a bird of the family Coraciidae * Roller (pigeon), a domesticated breed or variety of pigeon Devices * Roller (agricultural tool), a non-powered tool for flattening ground * Road roller, a vehicle for comp ...
'', 99% of which was written in x86 assembly. A
compiler In computing, a compiler is a computer program that translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primarily used for programs that ...
will sometimes produce assembly code as an intermediate step when translating a high-level program into machine code.


Keywords


Mnemonics and opcodes

Each x86 assembly instruction is represented by a
mnemonic A mnemonic ( ) device, or memory device, is any learning technique that aids information retention or retrieval (remembering) in the human memory for better understanding. Mnemonics make use of elaborative encoding, retrieval cues, and image ...
which, often combined with one or more operands, translates to one or more bytes called an
opcode In computing, an opcode (abbreviated from operation code, also known as instruction machine code, instruction code, instruction syllable, instruction parcel or opstring) is the portion of a machine language instruction that specifies the opera ...
; the NOP instruction translates to 0x90, for instance, and the HLT instruction translates to 0xF4. There are potential
opcode In computing, an opcode (abbreviated from operation code, also known as instruction machine code, instruction code, instruction syllable, instruction parcel or opstring) is the portion of a machine language instruction that specifies the opera ...
s with no documented mnemonic which different processors may interpret differently, making a program using them behave inconsistently or even generate an exception on some processors. These opcodes often turn up in code writing competitions as a way to make the code smaller, faster, more elegant or just show off the author's prowess.


Syntax

x86 assembly language has two main syntax branches: ''
Intel Intel Corporation is an American multinational corporation and technology company headquartered in Santa Clara, California, Santa Clara, California. It is the world's largest semiconductor chip manufacturer by revenue, and is one of the devel ...
syntax'' and '' AT&T syntax''. ''Intel syntax'' is dominant in the
DOS DOS is shorthand for the MS-DOS and IBM PC DOS family of operating systems. DOS may also refer to: Computing * Data over signalling (DoS), multiplexing data onto a signalling channel * Denial-of-service attack (DoS), an attack on a communicat ...
and
Windows Windows is a group of several proprietary graphical operating system families developed and marketed by Microsoft. Each family caters to a certain sector of the computing industry. For example, Windows NT for consumers, Windows Server for ...
world, and AT&T syntax is dominant in the
Unix Unix (; trademarked as UNIX) is a family of multitasking, multiuser computer operating systems that derive from the original AT&T Unix, whose development started in 1969 at the Bell Labs research center by Ken Thompson, Dennis Ritchie, a ...
world, since Unix was created at
AT&T Bell Labs Nokia Bell Labs, originally named Bell Telephone Laboratories (1925–1984), then AT&T Bell Laboratories (1984–1996) and Bell Labs Innovations (1996–2007), is an American industrial research and scientific development company owned by mult ...
. Here is a summary of the main differences between ''Intel syntax'' and ''AT&T syntax'': Many x86 assemblers use ''Intel syntax'', including
FASM FASM (''flat assembler'') is an assembler for x86 processors. It supports Intel-style assembly language on the IA-32 and x86-64 computer architectures. It claims high speed, size optimizations, operating system (OS) portability, and macro a ...
,
MASM The Microsoft Macro Assembler (MASM) is an x86 assembler that uses the Intel syntax for MS-DOS and Microsoft Windows. Beginning with MASM 8.0, there are two versions of the assembler: One for 16-bit & 32-bit assembly sources, and another (ML64 ...
, NASM,
TASM Turbo Assembler (sometimes shortened to the name of the executable, TASM) is an assembler for software development published by Borland in 1989. It runs on and produces code for 16- or 32-bit x86 MS-DOS and compatible on Microsoft Windows. ...
, and YASM.
GAS Gas is one of the four fundamental states of matter (the others being solid, liquid, and plasma). A pure gas may be made up of individual atoms (e.g. a noble gas like neon), elemental molecules made from one type of atom (e.g. oxygen), or ...
, which originally used ''AT&T syntax'', has supported both syntaxes since version 2.10 via the ''.intel_syntax'' directive. A quirk in the AT&T syntax for x86 is that
x87 x87 is a floating-point-related subset of the x86 architecture instruction set. It originated as an extension of the 8086 instruction set in the form of optional floating-point coprocessors that worked in tandem with corresponding x86 CPUs. These ...
operands are reversed, an inherited bug from the original AT&T assembler. The AT&T syntax is nearly universal to all other architectures with the same order; it was originally a syntax for PDP-11 assembly. The Intel syntax is specific to the
x86 architecture x86 (also known as 80x86 or the 8086 family) is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel based on the Intel 8086 microprocessor and its 8088 variant. The 8086 was ...
, and is the one used in the x86 platform's documentation.


Registers

x86 processors have a collection of registers available to be used as stores for binary data. Collectively the data and address registers are called the general registers. Each register has a special purpose in addition to what they can all do: * AX multiply/divide, string load & store * BX index register for MOVE * CX count for string operations & shifts * DX
port A port is a maritime facility comprising one or more wharves or loading areas, where ships load and discharge cargo and passengers. Although usually situated on a sea coast or estuary, ports can also be found far inland, such as ...
address for IN and OUT * SP points to top of
the stack The Stack is a colloquialism used to describe the symmetrical, four-level stack interchange in Phoenix, Arizona that facilitates movements between Interstate 17/U.S. Route 60 and Interstate 10. Description In 2006, the Stack interchange saw an ...
* BP points to base of the stack frame * SI points to a source in stream operations * DI points to a destination in stream operations Along with the general registers there are additionally the: * IP instruction pointer *
FLAGS A flag is a piece of fabric (most often rectangular or quadrilateral) with a distinctive design and colours. It is used as a symbol, a signalling device, or for decoration. The term ''flag'' is also used to refer to the graphic design emplo ...
* segment registers (CS, DS, ES, FS, GS, SS) which determine where a 64k segment starts (no FS & GS in
80286 The Intel 80286 (also marketed as the iAPX 286 and often called Intel 286) is a 16-bit microprocessor that was introduced on February 1, 1982. It was the first 8086-based CPU with separate, non- multiplexed address and data buses and also th ...
& earlier) * extra extension registers ( MMX,
3DNow! 3DNow! is a deprecated extension to the x86 instruction set developed by Advanced Micro Devices (AMD). It adds single instruction multiple data (SIMD) instructions to the base x86 instruction set, enabling it to perform vector processing of flo ...
, SSE, etc.) (Pentium & later only). The IP register points to the memory offset of the next instruction in the code segment (it points to the first byte of the instruction). The IP register cannot be accessed by the programmer directly. The x86 registers can be used by using the MOV instructions. For example, in Intel syntax: mov ax, 1234h ; copies the value 1234hex (4660d) into register AX mov bx, ax ; copies the value of the AX register into the BX register


Segmented addressing

The
x86 architecture x86 (also known as 80x86 or the 8086 family) is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel based on the Intel 8086 microprocessor and its 8088 variant. The 8086 was ...
in
real Real may refer to: Currencies * Brazilian real (R$) * Central American Republic real * Mexican real * Portuguese real * Spanish real * Spanish colonial real Music Albums * ''Real'' (L'Arc-en-Ciel album) (2000) * ''Real'' (Bright album) (201 ...
and
virtual 8086 mode In the 80386 microprocessor and later, virtual 8086 mode (also called virtual real mode, V86-mode, or VM86) allows the execution of real mode applications that are incapable of running directly in protected mode while the processor is running ...
uses a process known as segmentation to address memory, not the flat memory model used in many other environments. Segmentation involves composing a memory address from two parts, a ''segment'' and an ''offset''; the segment points to the beginning of a 64 KB (64×210) group of addresses and the offset determines how far from this beginning address the desired address is. In segmented addressing, two registers are required for a complete memory address. One to hold the segment, the other to hold the offset. In order to translate back into a flat address, the segment value is shifted four bits left (equivalent to multiplication by 24 or 16) then added to the offset to form the full address, whi