x86 memory segmentation refers to the implementation of
memory segmentation
Memory segmentation is an operating system memory management technique of division of a computer's primary memory into segments or sections. In a computer system using segmentation, a reference to a memory location includes a value that identi ...
in the Intel
x86 computer
instruction set architecture
In computer science, an instruction set architecture (ISA), also called computer architecture, is an abstract model of a computer. A device that executes instructions described by that ISA, such as a central processing unit (CPU), is called an ...
. Segmentation was introduced on the
Intel 8086
The 8086 (also called iAPX 86) is a 16-bit microprocessor chip designed by Intel between early 1976 and June 8, 1978, when it was released. The Intel 8088, released July 1, 1979, is a slightly modified chip with an external 8-bit data bus (allow ...
in 1978 as a way to allow programs to address more than 64 KB (65,536
byte
The byte is a unit of digital information that most commonly consists of eight bits. Historically, the byte was the number of bits used to encode a single character of text in a computer and for this reason it is the smallest addressable unit ...
s) of memory. The
Intel 80286
The Intel 80286 (also marketed as the iAPX 286 and often called Intel 286) is a 16-bit microprocessor that was introduced on February 1, 1982. It was the first 8086-based CPU with separate, non- multiplexed address and data buses and also the ...
introduced a second version of segmentation in 1982 that added support for
virtual memory
In computing, virtual memory, or virtual storage is a memory management technique that provides an "idealized abstraction of the storage resources that are actually available on a given machine" which "creates the illusion to users of a very ...
and
memory protection
Memory protection is a way to control memory access rights on a computer, and is a part of most modern instruction set architectures and operating systems. The main purpose of memory protection is to prevent a process from accessing memory that h ...
. At this point the original mode was renamed to
real mode
Real mode, also called real address mode, is an operating mode of all x86-compatible CPUs. The mode gets its name from the fact that addresses in real mode always correspond to real locations in memory. Real mode is characterized by a 20-bit s ...
, and the new version was named
protected mode
In computing, protected mode, also called protected virtual address mode, is an operational mode of x86-compatible central processing units (CPUs). It allows system software to use features such as virtual memory, paging and safe multi-taskin ...
. The
x86-64
x86-64 (also known as x64, x86_64, AMD64, and Intel 64) is a 64-bit version of the x86 instruction set, first released in 1999. It introduced two new modes of operation, 64-bit mode and compatibility mode, along with a new 4-level paging ...
architecture, introduced in 2003, has largely dropped support for segmentation in 64-bit mode.
In both real and protected modes, the system uses 16-bit ''segment registers'' to derive the actual memory address. In real mode, the registers CS, DS, SS, and ES point to the currently used program
code segment
In computing, a code segment, also known as a text segment or simply as text, is a portion of an object file or the corresponding section of the program's virtual address space that contains executable instructions.
Segment
The term "segment ...
(CS), the current
data segment
In computing, a data segment (often denoted .data) is a portion of an object file or the corresponding address space of a program that contains initialized static variables, that is, global variables and static local variables. The size of this seg ...
(DS), the current
stack segment (SS), and one ''extra'' segment determined by the programmer (ES). The
Intel 80386
The Intel 386, originally released as 80386 and later renamed i386, is a 32-bit microprocessor introduced in 1985. The first versions had 275,000 transistors[real mode
Real mode, also called real address mode, is an operating mode of all x86-compatible CPUs. The mode gets its name from the fact that addresses in real mode always correspond to real locations in memory. Real mode is characterized by a 20-bit s ...]
or
V86 mode, the size of a segment can range from 1
byte
The byte is a unit of digital information that most commonly consists of eight bits. Historically, the byte was the number of bits used to encode a single character of text in a computer and for this reason it is the smallest addressable unit ...
up to 65,536 bytes (using 16-bit offsets).
The 16-bit segment selector in the segment register is interpreted as the most significant 16 bits of a linear 20-bit address, called a segment address, of which the remaining four least significant bits are all zeros. The segment address is always added to a 16-bit offset in the instruction to yield a ''linear'' address, which is the same as
physical address
In computing, a physical address (also real address, or binary address), is a memory address that is represented in the form of a binary number on the address bus circuitry in order to enable the data bus to access a ''particular'' storage cel ...
in this mode. For instance, the segmented address 06EFh:1234h (here the suffix "h" means
hexadecimal
In mathematics and computing, the hexadecimal (also base-16 or simply hex) numeral system is a positional numeral system that represents numbers using a radix (base) of 16. Unlike the decimal system representing numbers using 10 symbols, h ...
) has a segment selector of 06EFh, representing a segment address of 06EF0h, to which the offset is added, yielding the linear address 06EF0h + 1234h = 08124h.
Because of the way the segment address and offset are added, a single linear address can be mapped to up to 2
12 = 4096 distinct segment:offset pairs. For example, the linear address 08124h can have the segmented addresses 06EFh:1234h, 0812h:0004h, 0000h:8124h, etc.
This could be confusing to programmers accustomed to unique addressing schemes, but it can also be used to advantage, for example when addressing multiple nested data structures. While real mode segments are always 64
KB long, the practical effect is only that no segment can be longer than 64 KB, rather than that every segment ''must'' be 64 KB long. Because there is no protection or privilege limitation in real mode, even if a segment could be defined to be smaller than 64 KB, it would still be entirely up to the programs to coordinate and keep within the bounds of their segments, as any program can always access any memory (since it can arbitrarily set segment selectors to change segment addresses with absolutely no supervision). Therefore, real mode can just as well be imagined as having a variable length for each segment, in the range 1 to 65,536 bytes, that is just not enforced by the CPU.
(The leading zeros of the linear address, segmented addresses, and the segment and offset fields are shown here for clarity. They are usually omitted.)
The effective 20-bit
address space
In computing, an address space defines a range of discrete addresses, each of which may correspond to a network host, peripheral device, disk sector, a memory cell or other logical or physical entity.
For software programs to save and retrieve ...
of real mode limits the
addressable memory to 2
20 bytes, or 1,048,576 bytes (1
MB). This derived directly from the hardware design of the Intel 8086 (and, subsequently, the closely related 8088), which had exactly 20
address pins. (Both were packaged in 40-pin DIP packages; even with only 20 address lines, the address and data buses were multiplexed to fit all the address and data lines within the limited pin count.)
Each segment begins at a multiple of 16 bytes, called a ''paragraph'', from the beginning of the linear (flat) address space. That is, at 16 byte intervals. Since all segments are 64 KB long, this explains how overlap can occur between segments and why any location in the linear memory address space can be accessed with many segment:offset pairs. The actual location of the beginning of a segment in the linear address space can be calculated with segment×16. A segment value of 0Ch (12) would give a linear address at C0h (192) in the linear address space. The address offset can then be added to this number. 0Ch:0Fh (12:15) would be C0h+0Fh=CFh (192+15=207), CFh (207) being the linear address. Such address translations are carried out by the segmentation unit of the CPU. The last segment, FFFFh (65535), begins at linear address FFFF0h (1048560), 16 bytes before the end of the 20 bit address space, and thus, can access, with an offset of up to 65,536 bytes, up to 65,520 (65536−16) bytes past the end of the 20 bit 8088 address space. On the 8088, these address accesses were wrapped around to the beginning of the address space such that 65535:16 would access address 0 and 65533:1000 would access address 952 of the linear address space. The use of this feature by programmers led to the
Gate A20 compatibility issues in later CPU generations, where the linear address space was expanded past 20 bits.
In 16-bit real mode, enabling applications to make use of multiple memory segments (in order to access more memory than available in any one 64K-segment) is quite complex, but was viewed as a necessary evil for all but the smallest tools (which could do with less memory). The root of the problem is that no appropriate address-arithmetic instructions suitable for flat addressing of the entire memory range are available. Flat addressing is possible by applying multiple instructions, which however leads to slower programs.
The ''
memory model'' concept derives from the setup of the segment registers. For example, in the ''tiny model'' CS=DS=SS, that is the program's code, data, and stack are all contained within a single 64 KB segment. In the ''small'' memory model DS=SS, so both data and stack reside in the same segment; CS points to a different code segment of up to 64 KB.
Protected mode
80286 protected mode
The
80286
The Intel 80286 (also marketed as the iAPX 286 and often called Intel 286) is a 16-bit microprocessor that was introduced on February 1, 1982. It was the first 8086-based CPU with separate, non- multiplexed address and data buses and also th ...
's
protected mode
In computing, protected mode, also called protected virtual address mode, is an operational mode of x86-compatible central processing units (CPUs). It allows system software to use features such as virtual memory, paging and safe multi-taskin ...
extends the processor's address space to 2
24 bytes (16 megabytes), but not by adjusting the shift value. Instead, the 16-bit segment registers now contain an index into a table of
segment descriptors containing 24-bit base addresses to which the offset is added. To support old software, the processor starts up in "real mode", a mode in which it uses the segmented addressing model of the 8086. There is a small difference though: the resulting physical address is no longer truncated to 20 bits, so
real mode
Real mode, also called real address mode, is an operating mode of all x86-compatible CPUs. The mode gets its name from the fact that addresses in real mode always correspond to real locations in memory. Real mode is characterized by a 20-bit s ...
pointers (but not 8086 pointers) can now refer to addresses between 100000
16 and 10FFEF
16. This roughly 64-kilobyte region of memory was known as the
High Memory Area (HMA), and later versions of
DOS could use it to increase the available "conventional" memory (i.e. within the first
MB). With the addition of the HMA, the total address space is approximately 1.06 MB. Though the 80286 does not truncate real-mode addresses to 20 bits, a system containing an 80286 can do so with hardware external to the processor, by gating off the 21st address line, the
A20 line. The IBM PC AT provided the hardware to do this (for full backward compatibility with software for the original
IBM PC and
PC/XT models), and so all subsequent "
AT-class" PC clones did as well.
286 protected mode was seldom used as it would have excluded the large body of users with 8086/88 machines. Moreover, it still necessitated dividing memory into 64k segments like was done in real mode. This limitation can be worked around on 32-bit CPUs which permit the use of memory pointers greater than 64k in size, however as the Segment Limit field is only 24-bit long, the maximum segment size that can be created is 16MB (although paging can be used to allocate more memory, no individual segment may exceed 16MB). This method was commonly used on Windows 3.x applications to produce a flat memory space, although as the OS itself was still 16-bit, API calls could not be made with 32-bit instructions. Thus, it was still necessary to place all code that performs API calls in 64k segments.
Once 286 protected mode is invoked, it could not be exited except by performing a hardware reset. Machines following the rising
IBM PC/AT
The IBM Personal Computer/AT (model 5170, abbreviated as IBM AT or PC/AT) was released in 1984 as the fourth model in the IBM Personal Computer line, following the IBM PC/XT and its IBM Portable PC variant. It was designed around the Intel 80 ...
standard could feign a reset to the CPU via the standardised keyboard controller, but this was significantly sluggish. Windows 3.x worked around both of these problems by intentionally triggering a