HOME

TheInfoList



OR:

A disassembler is a
computer program A computer program is a sequence or set of instructions in a programming language for a computer to Execution (computing), execute. It is one component of software, which also includes software documentation, documentation and other intangibl ...
that translates
machine language In computer programming, machine code is computer code consisting of machine language instructions, which are used to control a computer's central processing unit (CPU). For conventional binary computers, machine code is the binaryOn nonb ...
into
assembly language In computing, assembly language (alternatively assembler language or symbolic machine code), often referred to simply as assembly and commonly abbreviated as ASM or asm, is any low-level programming language with a very strong correspondence bet ...
—the inverse operation to that of an assembler. The output of disassembly is typically formatted for human-readability rather than for input to an assembler, making disassemblers primarily a
reverse-engineering Reverse engineering (also known as backwards engineering or back engineering) is a process or method through which one attempts to understand through deductive reasoning how a previously made device, process, system, or piece of software accompl ...
tool. Common uses include analyzing the output of
high-level programming language A high-level programming language is a programming language with strong Abstraction (computer science), abstraction from the details of the computer. In contrast to low-level programming languages, it may use natural language ''elements'', be ea ...
compiler In computing, a compiler is a computer program that Translator (computing), translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primaril ...
s and their optimizations, recovering
source code In computing, source code, or simply code or source, is a plain text computer program written in a programming language. A programmer writes the human readable source code to control the behavior of a computer. Since a computer, at base, only ...
when the original is lost, performing malware analysis, modifying software (such as binary patching), and
software cracking Software cracking (known as "breaking" mostly in the 1980s) is an act of removing copy protection from a software. Copy protection can be removed by applying a specific ''crack''. A ''crack'' can mean any tool that enables breaking software p ...
. A disassembler differs from a
decompiler A decompiler is a computer program that translates an executable file back into high-level source code. Unlike a compiler, which converts high-level code into machine code, a decompiler performs the reverse process. While disassemblers translate e ...
, which targets a
high-level language A high-level programming language is a programming language with strong abstraction from the details of the computer. In contrast to low-level programming languages, it may use natural language ''elements'', be easier to use, or may automate (or ...
rather than an assembly language. Assembly language
source code In computing, source code, or simply code or source, is a plain text computer program written in a programming language. A programmer writes the human readable source code to control the behavior of a computer. Since a computer, at base, only ...
generally permits the use of constants and programmer comments. These are usually removed from the assembled
machine code In computer programming, machine code is computer code consisting of machine language instructions, which are used to control a computer's central processing unit (CPU). For conventional binary computers, machine code is the binaryOn nonb ...
by the assembler. If so, a disassembler operating on the machine code would produce disassembly lacking these constants and comments; the disassembled output becomes more difficult for a human to interpret than the original annotated source code. Some disassemblers provide a built-in code commenting feature where the generated output is enriched with comments regarding called API functions or parameters of called functions. Some disassemblers make use of the symbolic debugging information present in object files such as
ELF An elf (: elves) is a type of humanoid supernatural being in Germanic peoples, Germanic folklore. Elves appear especially in Norse mythology, North Germanic mythology, being mentioned in the Icelandic ''Poetic Edda'' and the ''Prose Edda'' ...
. For example, IDA allows the human user to make up mnemonic symbols for values or regions of code in an interactive session: human insight applied to the disassembly process often parallels human creativity in the code writing process.


Challenges

It is not always possible to distinguish executable code from data within a binary. While common executable formats, such as
ELF An elf (: elves) is a type of humanoid supernatural being in Germanic peoples, Germanic folklore. Elves appear especially in Norse mythology, North Germanic mythology, being mentioned in the Icelandic ''Poetic Edda'' and the ''Prose Edda'' ...
and PE, separate code and data into distinct sections, flat binaries do not, making it unclear whether a given location contains executable instructions or non-executable data. This ambiguity might complicate the disassembly process. Additionally, CPUs often allow dynamic jumps computed at runtime, which makes it impossible to identify all possible locations in the binary that might be executed as instructions. On computer architectures with variable-width instructions, such as in many CISC architectures, more than one valid disassembly may exist for the same binary. Disassemblers also cannot handle code that changes during execution, as static analysis cannot account for runtime modifications. Encryption, packing, or
obfuscation Obfuscation is the obscuring of the intended meaning of communication by making the message difficult to understand, usually with confusing and ambiguous language. The obfuscation might be either unintentional or intentional (although intent ...
are often applied to computer programs, especially as part of
digital rights management Digital rights management (DRM) is the management of legal access to digital content. Various tools or technological protection measures, such as access control technologies, can restrict the use of proprietary hardware and copyrighted works. DRM ...
to deter reverse engineering and cracking. These techniques pose additional challenges for disassembly, as the code must first be unpacked or decrypted before meaningful analysis can begin.


Examples of disassemblers

A disassembler can be either stand-alone or interactive. A stand-alone disassembler generates an assembly language file upon execution, which can then be examined. In contrast, an interactive disassembler immediately reflects any changes made by the user. For example, if the disassembler initially treats a section of the program as data rather than code, the user can specify it as code. The disassembled code will then be updated and displayed instantly, allowing the user to analyze it and make further changes during the same session. Any interactive
debugger A debugger is a computer program used to test and debug other programs (the "target" programs). Common features of debuggers include the ability to run or halt the target program using breakpoints, step through code line by line, and display ...
will include some way of viewing the disassembly of the program being debugged. Often, the same disassembly tool will be packaged as a standalone disassembler distributed along with the debugger. For example, objdump, part of GNU Binutils, is related to the interactive debugger gdb. * Binary Ninja *
DEBUG In engineering, debugging is the process of finding the root cause, workarounds, and possible fixes for bugs. For software, debugging tactics can involve interactive debugging, control flow analysis, log file analysis, monitoring at the ap ...
* Interactive Disassembler (IDA) * Ghidra * Hiew * Hopper Disassembler * PE Explorer Disassembler * Netwide Disassembler (Ndisasm), companion to the Netwide Assembler (NASM). * OLIVER (
CICS IBM CICS (Customer Information Control System) is a family of mixed-language application servers that provide online business transaction management, transaction management and connectivity for applications on IBM mainframe systems under z/OS ...
interactive test/debug) includes disassemblers for Assembler,
COBOL COBOL (; an acronym for "common business-oriented language") is a compiled English-like computer programming language designed for business use. It is an imperative, procedural, and, since 2002, object-oriented language. COBOL is primarily ...
, and PL/1 *
x64dbg x64dbg is a free and open-source debugging software available on Windows Windows is a Product lining, product line of Proprietary software, proprietary graphical user interface, graphical operating systems developed and marketed by Micros ...
, a debugger for Windows that also performs dynamic disassembly * OllyDbg is a 32-bit assembler level analysing debugger *
Radare2 Radare2 (also known as r2) is a complete framework for reverse-engineering and analyzing binaries; composed of a set of small utilities that can be used together or independently from the command line. Built around a disassembler for computer s ...
* Rizin and Cutter (graphical interface for Rizin) * SIMON (batch interactive test/debug) includes disassemblers for Assembler, COBOL, and PL/1 * Sourcer, a commenting 16-bit/32-bit disassembler for DOS,
OS/2 OS/2 is a Proprietary software, proprietary computer operating system for x86 and PowerPC based personal computers. It was created and initially developed jointly by IBM and Microsoft, under the leadership of IBM software designer Ed Iacobucci, ...
and
Windows Windows is a Product lining, product line of Proprietary software, proprietary graphical user interface, graphical operating systems developed and marketed by Microsoft. It is grouped into families and subfamilies that cater to particular sec ...
by V Communications in the 1990s


Disassemblers and emulators

A dynamic disassembler can be integrated into the output of an
emulator In computing, an emulator is Computer hardware, hardware or software that enables one computer system (called the ''host'') to behave like another computer system (called the ''guest''). An emulator typically enables the host system to run sof ...
or
hypervisor A hypervisor, also known as a virtual machine monitor (VMM) or virtualizer, is a type of computer software, firmware or hardware that creates and runs virtual machines. A computer on which a hypervisor runs one or more virtual machines is called ...
to trace the real-time execution of machine instructions, displaying them line-by-line. In this setup, along with the disassembled machine code, the disassembler can show changes to registers, data, or other state elements (such as condition codes) caused by each instructions. This provides powerful debugging information for problem resolution. However, the output size can become quite large, particularly if the tracing is active throughout the entire execution of a program. These features were first introduced in the early 1970s by OLIVER as part of its
CICS IBM CICS (Customer Information Control System) is a family of mixed-language application servers that provide online business transaction management, transaction management and connectivity for applications on IBM mainframe systems under z/OS ...
debugging product and are now incorporated into the XPEDITER product from Compuware.


Length disassembler

A length disassembler, also known as length disassembler engine (LDE), is a tool that, given a sequence of bytes (instructions), outputs the number of bytes taken by the parsed instruction. Notable open source projects for the x86 architecture include ldisasm, Tiny x86 Length Disassembler and Extended Length Disassembler Engine for x86-64.


See also

* Control-flow graph *
Data-flow analysis Data-flow analysis is a technique for gathering information about the possible set of values calculated at various points in a computer program. It forms the foundation for a wide variety of compiler optimizations and program verification techn ...
*
Decompiler A decompiler is a computer program that translates an executable file back into high-level source code. Unlike a compiler, which converts high-level code into machine code, a decompiler performs the reverse process. While disassemblers translate e ...


References


Further reading

* *


External links

* List of x86 disassemblers in Wikibooks
Transformation Wiki on disassembly

Boomerang
A general, open source, retargetable decompiler of machine code programs
Online Disassembler
, a free online disassembler of arms, mips, ppc, and x86 code {{Authority control Debugging Reverse engineering