PE Format

The Portable Executable (PE) format is a file format for executables, object code, dynamic-link libraries (DLLs), and binary files used on 32-bit and 64-bit Windows operating systems, as well as in UEFI environments. It is the standard format for executables on Windows NT-based systems, including files such as .exe, .dll, .sys (for system drivers), and .mui. At its core, the PE format is a structured data container that gives the Windows operating system loader everything it needs to properly manage the executable code it contains. This includes references for dynamically linked libraries, tables for importing and exporting APIs, resource management data and thread-local storage (TLS) information. According to the Unified Extensible Firmware Interface (UEFI) specification, the PE format is also the accepted standard for executables in EFI environments.

On Windows NT systems, it currently supports a range of instruction sets, including IA-32, x86-64 (AMD64/Intel 64), IA-64, ARM and ARM64. Before the advent of Windows 2000, Windows NT (and by extension the PE format) also supported MIPS, Alpha, and PowerPC architectures. Moreover, thanks to its use in Windows CE, PE has maintained compatibility with several MIPS, ARM (including Thumb), and SuperH variants.

Functionally, the PE format is similar to other platform-specific executable formats, such as the ELF format used in Linux and most Unix-like systems, and the Mach-O format found in macOS and iOS.


History

Microsoft first introduced the PE format with Windows NT 3.1, replacing the older 16-bit New Executable (NE) format. Soon after, Windows 95, 98, ME, and the Win32s extension for Windows 3.1x all adopted the PE structure. Each PE file begins with a DOS executable header and a small DOS stub program, which generally displays the message "This program cannot be run in DOS mode". However, this DOS section can be replaced by a fully functional DOS program, as demonstrated in the Windows 98 SE installer. Developers can add such a program using the /STUB switch with Microsoft's linker, effectively creating a fat binary.

Over time, the PE format has grown with the Windows platform. Notable extensions include the .NET PE format for managed code, PE32+ for 64-bit address space support, and a specialized version for Windows CE.

To determine whether a PE file is intended for 32-bit or 64-bit architectures, one can examine the Machine field in the IMAGE_FILE_HEADER. Common machine values are 0x014c for 32-bit Intel processors and 0x8664 for x64 processors. Additionally, the Magic field in the IMAGE_OPTIONAL_HEADER reveals whether addresses are 32-bit or 64-bit. A value of 0x10B indicates a 32-bit (PE32) file, while 0x20B indicates a 64-bit (PE32+) file.
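The Machine and Magic checks described above can be sketched in a few lines of Python. This is a minimal illustration, not a full parser: the function names read_machine_and_magic and describe are hypothetical, and the code assumes the input is an executable image (which carries an optional header) rather than an object file.

```python
import struct

# Values taken from the text above; other machine types are reported as hex.
MACHINE_NAMES = {0x014C: "x86 (32-bit Intel)", 0x8664: "x64"}
MAGIC_NAMES = {0x10B: "PE32", 0x20B: "PE32+"}


def read_machine_and_magic(data: bytes):
    """Return (Machine, Magic) from the raw bytes of a PE image."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew: 4-byte little-endian offset of the PE signature, stored at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # IMAGE_FILE_HEADER follows the 4-byte signature; Machine is its first field.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    # IMAGE_FILE_HEADER is 20 bytes long; the optional header begins with Magic.
    (magic,) = struct.unpack_from("<H", data, pe_offset + 24)
    return machine, magic


def describe(data: bytes):
    """Translate the two fields into human-readable labels."""
    machine, magic = read_machine_and_magic(data)
    return (
        MACHINE_NAMES.get(machine, hex(machine)),
        MAGIC_NAMES.get(magic, hex(magic)),
    )
```

Given the bytes of a file read in binary mode, describe returns a pair of labels; any machine or magic value not listed in the two tables is returned as a hexadecimal string instead.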


Technical details


Layout

A PE file consists of several headers and sections that instruct the dynamic linker on how to map the file into memory. An executable image consists of several different regions, each requiring different memory protection attributes. Because memory protection is applied per page, the start of each section must be aligned to a page boundary in memory. For instance, the .text section, which contains program code, is typically mapped as execute/read-only. Conversely, the .data section, which holds global variables, is mapped as no-execute/read-write. However, to conserve space, sections are not aligned on disk in this manner. The dynamic linker maps each section to memory individually and assigns the correct permissions based on the information in the headers.
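As a rough illustration of the mapping step, the sketch below assigns page-aligned addresses to a list of sections and derives a protection string from each section's characteristics flags. The three IMAGE_SCN_MEM_* values are the standard flag bits from the PE specification; the helper names, the 0x1000 alignment, and the assumption that the headers occupy the first page are simplifications chosen for this example.

```python
# Section characteristics bits that control memory protection.
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_READ = 0x40000000
IMAGE_SCN_MEM_WRITE = 0x80000000


def align_up(value, alignment):
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def protection(characteristics):
    """Render the section's memory flags as an 'rwx'-style string."""
    pairs = (
        ("r", IMAGE_SCN_MEM_READ),
        ("w", IMAGE_SCN_MEM_WRITE),
        ("x", IMAGE_SCN_MEM_EXECUTE),
    )
    return "".join(letter if characteristics & flag else "-" for letter, flag in pairs)


def layout(sections, section_alignment=0x1000):
    """Place (name, virtual_size, characteristics) tuples at aligned offsets."""
    rva = section_alignment  # toy assumption: headers fill the first page
    placed = []
    for name, size, flags in sections:
        placed.append((name, rva, protection(flags)))
        # The next section starts at the next alignment boundary.
        rva = align_up(rva + size, section_alignment)
    return placed
```

With a 0x1234-byte .text section followed by a 0x200-byte .data section, this toy layout places .text at 0x1000 and .data at 0x3000, since .text spills into a second page and .data must start on a fresh one.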


Import table

The import address table (IAT) is used as a lookup table when the application calls a function in a different module. The imports can be specified by ordinal or by name. Because a compiled program cannot know the memory locations of its dependent libraries beforehand, an indirect jump is necessary for API calls. As the dynamic linker loads modules and resolves dependencies, it populates the IAT slots with the actual addresses of the corresponding library functions. Although this adds an extra jump, incurring a performance penalty compared to intramodular calls, it minimizes the number of memory pages that require copy-on-write changes, thus conserving memory and disk I/O. If a call is known to be intermodular beforehand (for example, because it is marked with a dllimport attribute), the compiler can generate optimized code with a simple indirect call opcode.
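The indirection through the IAT can be modelled with a toy example. The sketch below is not how the Windows loader is implemented; it only mimics the idea that calling code goes through a table of slots which a loader fills in after resolving each import by name or by ordinal. ToyModule, bind_imports and the library name used in the usage note are all hypothetical.

```python
class ToyModule:
    """A stand-in for a loaded image with an import address table."""

    def __init__(self, imports):
        # Each import is (library_name, symbol); symbol is a name or an ordinal.
        self.imports = list(imports)
        # One IAT slot per import, empty until the loader binds it.
        self.iat = [None] * len(self.imports)

    def call_import(self, slot, *args):
        # Compiled code performs an indirect call through the IAT slot.
        target = self.iat[slot]
        if target is None:
            raise RuntimeError("IAT slot has not been bound yet")
        return target(*args)


def bind_imports(module, loaded_libraries):
    """Play the loader's role: write each resolved target into its IAT slot."""
    for slot, (library, symbol) in enumerate(module.imports):
        exports = loaded_libraries[library]
        module.iat[slot] = exports[symbol]
```

For example, a module created with the imports ("mathlib.dll", "add") and ("mathlib.dll", 7) can call either function through its slot index once bind_imports has run against a dictionary of exports. Only the table is written during binding, which is the point of the design: the calling code itself never needs patching.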


Address Space Layout Randomization (ASLR)

PE files are not position-independent by default; they are compiled to run at a specific, fixed memory address. Modern operating systems use address space layout randomization (ASLR) to make it harder for attackers to exploit memory-related vulnerabilities. ASLR works by randomly changing the memory addresses of key parts of a process every time it is loaded, including the base address of the executable itself, the positions of shared libraries (DLLs), and memory areas such as the heap and stack. Because these addresses change on each load, attackers cannot reliably predict memory locations.
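Because an image is built for one fixed address, loading it at a randomized base means every absolute address stored inside it has to be adjusted by the difference between the two bases. The sketch below illustrates only that arithmetic. It assumes the list of locations to adjust is already known (in a real PE file this comes from the base relocation data, which the text above does not cover), and both the function name rebase and the fixup_offsets parameter are hypothetical.

```python
import struct


def rebase(image: bytearray, fixup_offsets, preferred_base, actual_base):
    """Add the load delta to each 64-bit absolute address in the image.

    fixup_offsets is a hypothetical stand-in for the file's relocation data:
    the byte offsets at which absolute addresses are stored.
    """
    delta = actual_base - preferred_base
    for offset in fixup_offsets:
        (value,) = struct.unpack_from("<Q", image, offset)
        # Wrap to 64 bits, as pointer arithmetic would.
        struct.pack_into("<Q", image, offset, (value + delta) & 0xFFFFFFFFFFFFFFFF)
    return image
```

If a pointer to 0x140001000 is stored in an image built for base 0x140000000 and the image is loaded at 0x7FF600000000 instead, the adjusted pointer becomes 0x7FF600001000: the offset within the image is preserved while the base changes.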


.NET, metadata, and the PE format

In a .NET executable, the PE code section contains a stub that invokes the CLR virtual machine startup entry, _CorExeMain or _CorDllMain in mscoree.dll, much as was the case in Visual Basic executables. The virtual machine then makes use of the .NET metadata present, the root of which, IMAGE_COR20_HEADER (also called the "CLR header"), is pointed to by the IMAGE_DIRECTORY_ENTRY_COMHEADER entry in the PE header's data directory (the entry was previously used for COM+ metadata in COM+ applications, hence the name). IMAGE_COR20_HEADER strongly resembles PE's optional header, essentially playing its role for the CLR loader. The CLR-related data, including the root structure itself, is typically contained in the common code section, .text. It is composed of a few directories: metadata, embedded resources, strong names and a few for native-code interoperability. The metadata directory is a set of tables that list all the distinct .NET entities in the assembly, including types, methods, fields, constants, events, as well as references between them and to other assemblies.
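One practical consequence is that a PE file can be recognised as a .NET assembly by checking whether the CLR header entry of the data directory is populated. The sketch below assumes that this entry is index 14 of the data directory (named IMAGE_DIRECTORY_ENTRY_COM_DESCRIPTOR in current Windows SDK headers) and uses the standard optional-header offsets for PE32 and PE32+. The function name clr_header_directory is hypothetical, and signature checks are omitted for brevity.

```python
import struct

CLR_DIRECTORY_INDEX = 14  # data-directory slot that points at IMAGE_COR20_HEADER


def clr_header_directory(data: bytes):
    """Return (rva, size) of the CLR header directory entry, or None if absent."""
    # e_lfanew at 0x3C gives the offset of the "PE\0\0" signature (not verified here).
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    optional = pe_offset + 24  # 4-byte signature + 20-byte IMAGE_FILE_HEADER
    (magic,) = struct.unpack_from("<H", data, optional)
    if magic == 0x10B:      # PE32
        count_offset = 92
    elif magic == 0x20B:    # PE32+
        count_offset = 108
    else:
        raise ValueError("unknown optional header magic")
    # NumberOfRvaAndSizes precedes the array of 8-byte (RVA, Size) entries.
    (count,) = struct.unpack_from("<I", data, optional + count_offset)
    if count <= CLR_DIRECTORY_INDEX:
        return None
    entry = optional + count_offset + 4 + 8 * CLR_DIRECTORY_INDEX
    rva, size = struct.unpack_from("<II", data, entry)
    return (rva, size) if rva and size else None
```

A result of None suggests a native image, while a non-empty entry gives the relative virtual address at which a fuller parser would go on to read IMAGE_COR20_HEADER.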


Use on other operating systems

The PE format is also used by ReactOS, an open-source operating system created to be binary-compatible with Windows. Historically, it has also been used by other operating systems such as SkyOS and BeOS R3. However, both SkyOS and BeOS eventually moved to ELF.

The Mono development platform, which aims to be binary compatible with the Microsoft .NET Framework, uses the same PE format as the Microsoft implementation. The same goes for Microsoft's own cross-platform .NET Core.

On x86(-64) Unix-like operating systems, Windows binaries (in PE format) can be executed using Wine. The HX DOS Extender also uses the PE format for native DOS 32-bit binaries, and can execute some Windows binaries in DOS, thus acting like an equivalent of Wine for DOS.

Mac OS X 10.5 has the ability to load and parse PE files, although it does not maintain binary compatibility with Windows.

UEFI and EFI firmware use PE files as well as the Windows ABI x64 calling convention for applications.


See also

* a.out
* Comparison of executable file formats
* Executable compression
* ar (Unix), since all COFF libraries use that same format
* Application virtualization



External links


* PE Format (latest online document, changes in time)
* Microsoft Portable Executable and Common Object File Format Specification: Revision 11.0, Jan 2017; Revision 10.0, Jun 2016; Revision 8.3, Feb 2013; Revision 8.2, Sep 2010; Revision 8.1, Feb 2008; Revision 8.0, May 2006; Revision 6.0, Feb 1999
* Tool Interface Standard (TIS) Formats Specifications for Windows Version 1.0 (Intel Order Number 241597, TIS Committee, Feb 1993)
* (Micheal J. O'Leary, Microsoft Developer Support)
* Peering Inside the PE: A Tour of the Win32 Portable Executable File Format. Matt Pietrek, Microsoft Systems Journal, March 1994
* An In-Depth Look into the Win32 Portable Executable File Format. Matt Pietrek, MSDN Magazine: Part I, February 2002; Part II, March 2002
* Ero Carrera's blog describing the PE header and how to walk through
* PE Internals provides an easy way to learn the Portable Executable File Format
* PE Explorer