Disassembler
   HOME

TheInfoList



OR:

A disassembler is a
computer program A computer program is a sequence or set of instructions in a programming language for a computer to Execution (computing), execute. It is one component of software, which also includes software documentation, documentation and other intangibl ...
that
translate Translation is the communication of the meaning of a source-language text by means of an equivalent target-language text. The English language draws a terminological distinction (which does not exist in every language) between ''transla ...
s
machine language In computer programming, machine code is computer code consisting of machine language instructions, which are used to control a computer's central processing unit (CPU). For conventional binary computers, machine code is the binaryOn nonb ...
into
assembly language In computing, assembly language (alternatively assembler language or symbolic machine code), often referred to simply as assembly and commonly abbreviated as ASM or asm, is any low-level programming language with a very strong correspondence bet ...
—the inverse operation to that of an assembler. The output of disassembly is typically formatted for human-readability rather than for input to an assembler, making disassemblers primarily a
reverse-engineering Reverse engineering (also known as backwards engineering or back engineering) is a process or method through which one attempts to understand through deductive reasoning how a previously made device, process, system, or piece of software accompl ...
tool. Common uses include analyzing the output of
high-level programming language A high-level programming language is a programming language with strong Abstraction (computer science), abstraction from the details of the computer. In contrast to low-level programming languages, it may use natural language ''elements'', be ea ...
compiler In computing, a compiler is a computer program that Translator (computing), translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primaril ...
s and their optimizations, recovering
source code In computing, source code, or simply code or source, is a plain text computer program written in a programming language. A programmer writes the human readable source code to control the behavior of a computer. Since a computer, at base, only ...
when the original is lost, performing
malware analysis Malware analysis is the study or process of determining the functionality, origin and potential impact of a given malware sample such as a virus, worm, trojan horse, rootkit, or backdoor. Malware or malicious software is any computer software inte ...
, modifying software (such as binary patching), and
software cracking Software cracking (known as "breaking" mostly in the 1980s) is an act of removing copy protection from a software. Copy protection can be removed by applying a specific ''crack''. A ''crack'' can mean any tool that enables breaking software p ...
. A disassembler differs from a
decompiler A decompiler is a computer program that translates an executable file back into high-level source code. Unlike a compiler, which converts high-level code into machine code, a decompiler performs the reverse process. While disassemblers translate e ...
, which targets a
high-level language A high-level programming language is a programming language with strong abstraction from the details of the computer. In contrast to low-level programming languages, it may use natural language ''elements'', be easier to use, or may automate (or ...
rather than an assembly language. Assembly language
source code In computing, source code, or simply code or source, is a plain text computer program written in a programming language. A programmer writes the human readable source code to control the behavior of a computer. Since a computer, at base, only ...
generally permits the use of constants and programmer
comment Comment may refer to: Computing * Comment (computer programming), explanatory text or information embedded in the source code of a computer program * Comment programming, a software development technique based on the regular use of comment tags ...
s. These are usually removed from the assembled
machine code In computer programming, machine code is computer code consisting of machine language instructions, which are used to control a computer's central processing unit (CPU). For conventional binary computers, machine code is the binaryOn nonb ...
by the assembler. If so, a disassembler operating on the machine code would produce disassembly lacking these constants and comments; the disassembled output becomes more difficult for a human to interpret than the original annotated source code. Some disassemblers provide a built-in code commenting feature where the generated output is enriched with comments regarding called API functions or parameters of called functions. Some disassemblers make use of the
symbolic debugging A debug symbol is a special kind of symbol (computing), symbol that attaches additional information to the symbol table of an object file, such as a shared library or an executable. This information allows a symbolic debugger to gain access to in ...
information present in object files such as
ELF An elf (: elves) is a type of humanoid supernatural being in Germanic peoples, Germanic folklore. Elves appear especially in Norse mythology, North Germanic mythology, being mentioned in the Icelandic ''Poetic Edda'' and the ''Prose Edda'' ...
. For example, IDA allows the human user to make up mnemonic symbols for values or regions of code in an interactive session: human insight applied to the disassembly process often parallels human creativity in the code writing process.


Challenges

It is not always possible to distinguish executable code from data within a binary. While common executable formats, such as
ELF An elf (: elves) is a type of humanoid supernatural being in Germanic peoples, Germanic folklore. Elves appear especially in Norse mythology, North Germanic mythology, being mentioned in the Icelandic ''Poetic Edda'' and the ''Prose Edda'' ...
and PE, separate code and data into distinct sections, flat binaries do not, making it unclear whether a given location contains executable instructions or non-executable data. This ambiguity might complicate the disassembly process. Additionally, CPUs often allow dynamic jumps computed at runtime, which makes it impossible to identify all possible locations in the binary that might be executed as instructions. On computer architectures with variable-width instructions, such as in many CISC architectures, more than one valid disassembly may exist for the same binary. Disassemblers also cannot handle code that changes during execution, as static analysis cannot account for runtime modifications. Encryption, packing, or
obfuscation Obfuscation is the obscuring of the intended meaning of communication by making the message difficult to understand, usually with confusing and ambiguous language. The obfuscation might be either unintentional or intentional (although intent ...
are often applied to computer programs, especially as part of
digital rights management Digital rights management (DRM) is the management of legal access to digital content. Various tools or technological protection measures, such as access control technologies, can restrict the use of proprietary hardware and copyrighted works. DRM ...
to deter reverse engineering and
cracking Cracking may refer to: * Cracking, the formation of a fracture or partial fracture in a solid material studied as fracture mechanics ** Performing a sternotomy * Fluid catalytic cracking, a catalytic process widely used in oil refineries for crac ...
. These techniques pose additional challenges for disassembly, as the code must first be unpacked or decrypted before meaningful analysis can begin.


Examples of disassemblers

A disassembler can be either stand-alone or interactive. A stand-alone disassembler generates an assembly language file upon execution, which can then be examined. In contrast, an interactive disassembler immediately reflects any changes made by the user. For example, if the disassembler initially treats a section of the program as data rather than code, the user can specify it as code. The disassembled code will then be updated and displayed instantly, allowing the user to analyze it and make further changes during the same session. Any interactive
debugger A debugger is a computer program used to test and debug other programs (the "target" programs). Common features of debuggers include the ability to run or halt the target program using breakpoints, step through code line by line, and display ...
will include some way of viewing the disassembly of the program being debugged. Often, the same disassembly tool will be packaged as a standalone disassembler distributed along with the debugger. For example,
objdump objdump is a command-line program for displaying various information about object files on Unix-like operating systems. For instance, it can be used as a disassembler to view an executable in assembly form. It is part of the GNU Binutils for fi ...
, part of
GNU Binutils The GNU Binary Utilities, or , is a collection of programming tools maintained by the GNU Project for working with executable code including assembly, linking and many other development operations. The tools are originally from Cygnus Solut ...
, is related to the interactive debugger
gdb The GNU Debugger (GDB) is a portable debugger that runs on many Unix-like systems and works for many programming languages, including Ada, Assembly, C, C++, D, Fortran, Haskell, Go, Objective-C, OpenCL C, Modula-2, Pascal, Rust, and par ...
. * Binary Ninja *
DEBUG In engineering, debugging is the process of finding the root cause, workarounds, and possible fixes for bugs. For software, debugging tactics can involve interactive debugging, control flow analysis, log file analysis, monitoring at the ap ...
*
Interactive Disassembler The Interactive Disassembler (IDA) is a disassembler for computer software which generates assembly language source code from machine-executable code. It supports a variety of executable formats for different processors and operating systems. ...
(IDA) *
Ghidra Ghidra (pronounced GEE-druh; ) is a free and open source reverse engineering tool developed by the National Security Agency (NSA) of the United States. The binaries were released at RSA Conference in March 2019; the sources were published one ...
* Hiew * Hopper Disassembler * PE Explorer Disassembler * Netwide Disassembler (Ndisasm), companion to the
Netwide Assembler The Netwide Assembler (NASM) is an assembler and disassembler for the Intel x86 architecture. It can be used to write 16-bit, 32-bit (IA-32) and 64-bit In computer architecture, 64-bit integers, memory addresses, or other data units are ...
(NASM). * OLIVER (
CICS IBM CICS (Customer Information Control System) is a family of mixed-language application servers that provide online business transaction management, transaction management and connectivity for applications on IBM mainframe systems under z/OS ...
interactive test/debug) includes disassemblers for Assembler,
COBOL COBOL (; an acronym for "common business-oriented language") is a compiled English-like computer programming language designed for business use. It is an imperative, procedural, and, since 2002, object-oriented language. COBOL is primarily ...
, and
PL/1 PL/I (Programming Language One, pronounced and sometimes written PL/1) is a procedural, imperative computer programming language initially developed by IBM. It is designed for scientific, engineering, business and system programming. It has b ...
*
x64dbg x64dbg is a free and open-source debugging software available on Windows Windows is a Product lining, product line of Proprietary software, proprietary graphical user interface, graphical operating systems developed and marketed by Micros ...
, a debugger for Windows that also performs dynamic disassembly *
OllyDbg OllyDbg (named after its author, Oleh Yuschuk) is an x86 debugger that emphasizes binary code analysis, which is useful when source code is not available. It traces registers, recognizes procedures, API calls, switches, tables, constants and st ...
is a 32-bit assembler level analysing debugger *
Radare2 Radare2 (also known as r2) is a complete framework for reverse-engineering and analyzing binaries; composed of a set of small utilities that can be used together or independently from the command line. Built around a disassembler for computer s ...
* Rizin and Cutter (graphical interface for Rizin) * SIMON (batch interactive test/debug) includes disassemblers for Assembler, COBOL, and PL/1 * Sourcer, a commenting 16-bit/32-bit disassembler for
DOS DOS (, ) is a family of disk-based operating systems for IBM PC compatible computers. The DOS family primarily consists of IBM PC DOS and a rebranded version, Microsoft's MS-DOS, both of which were introduced in 1981. Later compatible syste ...
,
OS/2 OS/2 is a Proprietary software, proprietary computer operating system for x86 and PowerPC based personal computers. It was created and initially developed jointly by IBM and Microsoft, under the leadership of IBM software designer Ed Iacobucci, ...
and
Windows Windows is a Product lining, product line of Proprietary software, proprietary graphical user interface, graphical operating systems developed and marketed by Microsoft. It is grouped into families and subfamilies that cater to particular sec ...
by V Communications in the 1990s


Disassemblers and emulators

A dynamic disassembler can be integrated into the output of an
emulator In computing, an emulator is Computer hardware, hardware or software that enables one computer system (called the ''host'') to behave like another computer system (called the ''guest''). An emulator typically enables the host system to run sof ...
or
hypervisor A hypervisor, also known as a virtual machine monitor (VMM) or virtualizer, is a type of computer software, firmware or hardware that creates and runs virtual machines. A computer on which a hypervisor runs one or more virtual machines is called ...
to trace the real-time execution of machine instructions, displaying them line-by-line. In this setup, along with the disassembled machine code, the disassembler can show changes to registers, data, or other state elements (such as condition codes) caused by each instructions. This provides powerful debugging information for problem resolution. However, the output size can become quite large, particularly if the tracing is active throughout the entire execution of a program. These features were first introduced in the early 1970s by OLIVER as part of its
CICS IBM CICS (Customer Information Control System) is a family of mixed-language application servers that provide online business transaction management, transaction management and connectivity for applications on IBM mainframe systems under z/OS ...
debugging product and are now incorporated into the XPEDITER product from
Compuware Compuware Corporation was an American software company based in Detroit. The company offered products aimed at the information technology (IT) departments of large businesses, and its services also included testing, development, automation and p ...
.


Length disassembler

A length disassembler, also known as length disassembler engine (LDE), is a tool that, given a sequence of bytes (instructions), outputs the number of bytes taken by the parsed instruction. Notable open source projects for the x86 architecture include ldisasm, Tiny x86 Length Disassembler and Extended Length Disassembler Engine for x86-64.


See also

*
Control-flow graph In computer science, a control-flow graph (CFG) is a representation, using graph notation, of all paths that might be traversed through a program during its execution. The control-flow graph was conceived by Frances E. Allen, who noted that ...
*
Data-flow analysis Data-flow analysis is a technique for gathering information about the possible set of values calculated at various points in a computer program. It forms the foundation for a wide variety of compiler optimizations and program verification techn ...
*
Decompiler A decompiler is a computer program that translates an executable file back into high-level source code. Unlike a compiler, which converts high-level code into machine code, a decompiler performs the reverse process. While disassemblers translate e ...


References


Further reading

* *


External links

* List of x86 disassemblers in Wikibooks
Transformation Wiki on disassembly

Boomerang
A general, open source, retargetable decompiler of machine code programs
Online Disassembler
, a free online disassembler of arms, mips, ppc, and x86 code {{Authority control Debugging Reverse engineering