HOME

TheInfoList



OR:

In
computing Computing is any goal-oriented activity requiring, benefiting from, or creating computing machinery. It includes the study and experimentation of algorithmic processes, and development of both hardware and software. Computing has scientific, ...
, binary translation is a form of binary recompilation where sequences of instructions are translated from a ''source''
instruction set In computer science, an instruction set architecture (ISA), also called computer architecture, is an abstract model of a computer. A device that executes instructions described by that ISA, such as a central processing unit (CPU), is called an ...
to the ''target'' instruction set. In some cases such as instruction set simulation, the target instruction set may be the same as the source instruction set, providing testing and debugging features such as instruction trace, conditional breakpoints and hot spot detection. The two main types are static and dynamic binary translation. Translation can be done in hardware (for example, by circuits in a
CPU A central processing unit (CPU), also called a central processor, main processor or just processor, is the electronic circuitry that executes instructions comprising a computer program. The CPU performs basic arithmetic, logic, controlling, a ...
) or in software (e.g. run-time engines, static recompiler, emulators).


Motivation

Binary translation is motivated by a lack of a binary for a target platform, the lack of source code to compile for the target platform, or otherwise difficulty in compiling the source for the target platform. Statically-recompiled binaries run potentially faster than their respective emulated binaries, as the emulation overhead is removed. This is similar to the difference in performance between interpreted and compiled programs in general.


Static binary translation

A translator using static binary translation aims to convert all of the code of an executable file into code that runs on the target architecture without having to run the code first, as is done in dynamic binary translation. This is very difficult to do correctly, since not all the code can be discovered by the translator. For example, some parts of the executable may be reachable only through
indirect branch An indirect branch (also known as a computed jump, indirect jump and register-indirect jump) is a type of program control instruction present in some machine language instruction sets. Rather than specifying the address of the next instructi ...
es, whose value is known only at run-time. One such static binary translator uses universal superoptimizer peephole technology (developed by Sorav Bansal and Alex Aiken from
Stanford University Stanford University, officially Leland Stanford Junior University, is a private research university in Stanford, California. The campus occupies , among the largest in the United States, and enrolls over 17,000 students. Stanford is conside ...
) to perform efficient translation between possibly many source and target pairs, with considerably low development costs and high performance of the target binary. In experiments of PowerPC-to-x86 translations, some binaries even outperformed native versions, but on average they ran at two-thirds of native speed.


Examples for static binary translations

Honeywell Honeywell International Inc. is an American publicly traded, multinational conglomerate corporation headquartered in Charlotte, North Carolina. It primarily operates in four areas of business: aerospace, building technologies, performance ma ...
provided a program called the Liberator for their Honeywell 200 series of computers; it could translate programs for the
IBM 1400 series The IBM 1400 series were second-generation (transistor) mid-range business decimal computers that IBM marketed in the early 1960s. The computers were offered to replace tabulating machines like the IBM 407. The 1400-series machines stored infor ...
of computers into programs for the Honeywell 200 series. In 2014, an
ARM architecture ARM (stylised in lowercase as arm, formerly an acronym for Advanced RISC Machines and originally Acorn RISC Machine) is a family of reduced instruction set computer (RISC) instruction set architectures for computer processors, configured ...
version of the 1998
video game Video games, also known as computer games, are electronic games that involves interaction with a user interface or input device such as a joystick, controller, keyboard, or motion sensing device to generate visual feedback. This feedba ...
''
StarCraft ''StarCraft'' is a military science fiction media franchise created by Chris Metzen and James Phinney and owned by Blizzard Entertainment. The series, set in the beginning of the 26th century, centers on a galactic struggle for dominance a ...
'' was generated by static recompilation and additional
reverse engineering Reverse engineering (also known as backwards engineering or back engineering) is a process or method through which one attempts to understand through deductive reasoning how a previously made device, process, system, or piece of software accompli ...
of the original x86 version. The
Pandora In Greek mythology, Pandora (Greek language, Greek: , derived from , ''pān'', i.e. "all" and , ''dōron'', i.e. "gift", thus "the all-endowed", "all-gifted" or "all-giving") was the first human woman created by Hephaestus on the instructions ...
handheld community was capable of developing the required tools on their own and achieving such translations successfully several times. For instance, a successful x86-to-
x64 x86-64 (also known as x64, x86_64, AMD64, and Intel 64) is a 64-bit version of the x86 instruction set, first released in 1999. It introduced two new modes of operation, 64-bit mode and compatibility mode, along with a new 4-level paging mod ...
static recompilation was generated for the procedural terrain generator of the video game ''
Cube World ''Cube World'' is an action role-playing game developed and published by Picroma for Microsoft Windows. Wolfram von Funck, the game's designer, began developing the game in June 2011, and was later joined by his wife, Sarah. An alpha version of ...
'' in 2014. Another example is the NES-to- x86 statically recompiled version of the videogame ''
Super Mario Bros. is a platform game developed and published by Nintendo for the Nintendo Entertainment System (NES). The successor to the 1983 arcade game '' Mario Bros.'' and the first game in the ''Super Mario'' series, it was first released in 1985 for ...
'' which was generated under usage of
LLVM LLVM is a set of compiler and toolchain technologies that can be used to develop a front end for any programming language and a back end for any instruction set architecture. LLVM is designed around a language-independent intermediate repre ...
in 2013. In 2004 Scott Elliott and Phillip R. Hutchinson at
Nintendo is a Japanese multinational video game company headquartered in Kyoto, Japan. It develops video games and video game consoles. Nintendo was founded in 1889 as by craftsman Fusajiro Yamauchi and originally produced handmade playing cards ...
developed a tool to generate "C" code from
Game Boy The is an 8-bit fourth generation handheld game console developed and manufactured by Nintendo. It was first released in Japan on April 21, 1989, in North America later the same year, and in Europe in late 1990. It was designed by the same t ...
binary that could then be compiled for a new platform and linked against a hardware library for use in airline entertainment systems. In 1995 Norman Ramsey at
Bell Communications Research iconectiv is a supplier of network planning and network management services to telecommunications providers. Known as Bellcore after its establishment in the United States in 1983 as part of the break-up of the Bell System, the company's name ...
and Mary F. Fernandez at Department of Computer Science,
Princeton University Princeton University is a private research university in Princeton, New Jersey. Founded in 1746 in Elizabeth as the College of New Jersey, Princeton is the fourth-oldest institution of higher education in the United States and one of the ...
developed ''The New Jersey Machine-Code Toolkit'' that had the basic tools for static assembly translation.


Dynamic binary translation

Dynamic binary translation (DBT) looks at a short sequence of code—typically on the order of a single
basic block In compiler construction, a basic block is a straight-line code sequence with no branches in except to the entry and no branches out except at the exit. This restricted form makes a basic block highly amenable to analysis. Compilers usually deco ...
—then translates it and caches the resulting sequence. Code is only translated as it is discovered and when possible, and branch instructions are made to point to already translated and saved code (
memoization In computing, memoization or memoisation is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached result when the same inputs occur again. Memoization ...
). Dynamic binary translation differs from simple emulation (eliminating the emulator's main read-decode-execute loop—a major performance bottleneck), paying for this by large overhead during translation time. This overhead is hopefully amortized as translated code sequences are executed multiple times. More advanced dynamic translators employ
dynamic recompilation In computer science, dynamic recompilation is a feature of some emulators and virtual machines, where the system may recompile some part of a program during execution. By compiling during execution, the system can tailor the generated code t ...
where the translated code is instrumented to find out what portions are executed a large number of times, and these portions are optimized aggressively. This technique is reminiscent of a JIT compiler, and in fact such compilers (e.g.
Sun The Sun is the star at the center of the Solar System. It is a nearly perfect ball of hot plasma, heated to incandescence by nuclear fusion reactions in its core. The Sun radiates this energy mainly as light, ultraviolet, and infrared radi ...
's HotSpot technology) can be viewed as dynamic translators from a virtual instruction set (the
bytecode Bytecode (also called portable code or p-code) is a form of instruction set designed for efficient execution by a software interpreter. Unlike human-readable source code, bytecodes are compact numeric codes, constants, and references (norma ...
) to a real one.


Examples for dynamic binary translations in software

*
Apple Computer Apple Inc. is an American multinational technology company headquartered in Cupertino, California, United States. Apple is the largest technology company by revenue (totaling in 2021) and, as of June 2022, is the world's biggest company ...
implemented a dynamic translating
emulator In computing, an emulator is hardware or software that enables one computer system (called the ''host'') to behave like another computer system (called the ''guest''). An emulator typically enables the host system to run software or use pe ...
for
M68K The Motorola 68000 series (also known as 680x0, m68000, m68k, or 68k) is a family of 32-bit complex instruction set computer (CISC) microprocessors. During the 1980s and early 1990s, they were popular in personal computers and workstations and w ...
code in their
PowerPC PowerPC (with the backronym Performance Optimization With Enhanced RISC – Performance Computing, sometimes abbreviated as PPC) is a reduced instruction set computer (RISC) instruction set architecture (ISA) created by the 1991 Apple– IBM– ...
line of Macintoshes, which achieved a very high level of reliability, performance and compatibility (see Mac 68K emulator). This allowed Apple to bring the machines to market with only a partially native
operating system An operating system (OS) is system software that manages computer hardware, software resources, and provides common daemon (computing), services for computer programs. Time-sharing operating systems scheduler (computing), schedule tasks for ef ...
, and end users could adopt the new, faster architecture without risking their investment in software. Partly because the emulator was so successful, many parts of the operating system remained emulated. A full transition to a PowerPC native
operating system An operating system (OS) is system software that manages computer hardware, software resources, and provides common daemon (computing), services for computer programs. Time-sharing operating systems scheduler (computing), schedule tasks for ef ...
(OS) was not made until the release of
Mac OS X macOS (; previously OS X and originally Mac OS X) is a Unix operating system developed and marketed by Apple Inc. since 2001. It is the primary operating system for Apple's Mac computers. Within the market of desktop and lap ...
(10.0) in 2001. (The Mac OS X "
Classic A classic is an outstanding example of a particular style; something of lasting worth or with a timeless quality; of the first or highest quality, class, or rank – something that exemplifies its class. The word can be an adjective (a ''c ...
" runtime environment continued to offer this emulation capability on PowerPC Macs until
Mac OS X 10.5 Mac OS X Leopard (version 10.5) is the sixth software versioning, major release of macOS, Apple Inc., Apple's desktop and server operating system for Macintosh computers. Leopard was released on October 26, 2007 as the successor of Mac OS X Tig ...
.) * Mac OS X 10.4.4 for Intel-based Macs introduced the
Rosetta Rosetta or Rashid (; ar, رشيد ' ; french: Rosette  ; cop, ϯⲣⲁϣⲓⲧ ''ti-Rashit'', Ancient Greek: Βολβιτίνη ''Bolbitinē'') is a port city of the Nile Delta, east of Alexandria, in Egypt's Beheira governorate. The R ...
dynamic translation layer to ease Apple's transition from PPC-based hardware to x86. Developed for Apple by Transitive Corporation, the Rosetta software is an implementation of Transitive's
QuickTransit QuickTransit was a cross-platform virtualization program developed by Transitive Corporation. It allowed software compiled for one specific processor and operating system combination to be executed on a different processor and/or operating syst ...
solution. * QuickTransit during its product lifespan also provided
SPARC SPARC (Scalable Processor Architecture) is a reduced instruction set computer (RISC) instruction set architecture originally developed by Sun Microsystems. Its design was strongly influenced by the experimental Berkeley RISC system develope ...
x86, x86→
PowerPC PowerPC (with the backronym Performance Optimization With Enhanced RISC – Performance Computing, sometimes abbreviated as PPC) is a reduced instruction set computer (RISC) instruction set architecture (ISA) created by the 1991 Apple– IBM– ...
and MIPS
Itanium 2 Itanium ( ) is a discontinued family of 64-bit Intel microprocessors that implement the Intel Itanium architecture (formerly called IA-64). Launched in June 2001, Intel marketed the processors for enterprise servers and high-performance comput ...
translation support. * DEC achieved similar success with its translation tools to help users migrate from the CISC VAX architecture to the
Alpha Alpha (uppercase , lowercase ; grc, ἄλφα, ''álpha'', or ell, άλφα, álfa) is the first letter of the Greek alphabet. In the system of Greek numerals, it has a value of one. Alpha is derived from the Phoenician letter aleph , whi ...
RISC In computer engineering, a reduced instruction set computer (RISC) is a computer designed to simplify the individual instructions given to the computer to accomplish tasks. Compared to the instructions given to a complex instruction set comp ...
architecture. * HP ARIES (Automatic Re-translation and Integrated Environment Simulation) is a software dynamic binary translation system that combines fast code interpretation with two phase dynamic translation to transparently and accurately execute
HP 9000 HP 9000 is a line of workstation and server computer systems produced by the Hewlett-Packard (HP) Company. The native operating system for almost all HP 9000 systems is HP-UX, which is based on UNIX System V. The HP 9000 brand was introduced ...
HP-UX HP-UX (from "Hewlett Packard Unix") is Hewlett Packard Enterprise's proprietary implementation of the Unix operating system, based on Unix System V (initially System III) and first released in 1984. Current versions support HPE Integrity Se ...
applications on
HP-UX HP-UX (from "Hewlett Packard Unix") is Hewlett Packard Enterprise's proprietary implementation of the Unix operating system, based on Unix System V (initially System III) and first released in 1984. Current versions support HPE Integrity Se ...
11i for
HPE Integrity Servers HPE Integrity is a series of server computers produced by Hewlett Packard Enterprise (formerly Hewlett-Packard) since 2003, based on the Itanium processor. The Integrity brand name was inherited by HP from Tandem Computers via Compaq. In 20 ...
. The ARIES fast interpreter emulates a complete set of non-privileged
PA-RISC PA-RISC is an instruction set architecture (ISA) developed by Hewlett-Packard. As the name implies, it is a reduced instruction set computer (RISC) architecture, where the PA stands for Precision Architecture. The design is also referred to as ...
instructions with no user intervention. During interpretation, it monitors the application's execution pattern and translates only the frequently executed code into native
Itanium Itanium ( ) is a discontinued family of 64-bit Intel microprocessors that implement the Intel Itanium architecture (formerly called IA-64). Launched in June 2001, Intel marketed the processors for enterprise servers and high-performance comput ...
code at runtime. ARIES implements two phase dynamic translation, a technique in which translated code in first phase collects runtime profile information which is used during second phase translation to further optimize the translated code. ARIES stores the dynamically translated code in memory buffer called code cache. Further references to translated basic blocks execute directly in the code cache and do not require additional interpretation or translation. The targets of translated code blocks are back-patched to ensure execution takes place in code cache most of the time. At the end of the emulation, ARIES discards all the translated code without modifying the original application. The ARIES emulation engine also implements Environment Emulation which emulates an
HP 9000 HP 9000 is a line of workstation and server computer systems produced by the Hewlett-Packard (HP) Company. The native operating system for almost all HP 9000 systems is HP-UX, which is based on UNIX System V. The HP 9000 brand was introduced ...
HP-UX HP-UX (from "Hewlett Packard Unix") is Hewlett Packard Enterprise's proprietary implementation of the Unix operating system, based on Unix System V (initially System III) and first released in 1984. Current versions support HPE Integrity Se ...
application's system calls, signal delivery, exception management, threads management, emulation of HP GDB for debugging, and core file creation for the application. * DEC created the
FX!32 FX!32 is a software emulator program that allows Win32 programs built for the Intel x86 instruction set to execute on DEC Alpha-based systems running Windows NT. Released in 1996, FX!32 was developed by Digital Equipment Corporation (DEC) to ...
binary translator for converting x86 applications to Alpha applications. *
Sun Microsystems Sun Microsystems, Inc. (Sun for short) was an American technology company that sold computers, computer components, software, and information technology services and created the Java programming language, the Solaris operating system, ZFS, t ...
' Wabi software included dynamic translation from x86 to SPARC instructions. * In January 2000,
Transmeta Transmeta Corporation was an American fabless semiconductor company based in Santa Clara, California. It developed low power x86 compatible microprocessors based on a VLIW core and a software layer called Code Morphing Software. Code Morphing S ...
Corporation announced a novel processor design named
Crusoe Crusoe may refer to: Art, entertainment, and media * ''Crusoe'' (film), a 1989 film by Caleb Deschanel based on the novel ''Robinson Crusoe'' * ''Crusoe'' (TV series), a 2008 television series based on the novel ''Robinson Crusoe'' * Crusoe the ...
. From the FAQ on their web site, *
Intel Intel Corporation is an American multinational corporation and technology company headquartered in Santa Clara, California. It is the world's largest semiconductor chip manufacturer by revenue, and is one of the developers of the x86 ser ...
Corporation developed and implemented an
IA-32 Execution Layer The IA-32 Execution Layer (IA-32 EL) is a software emulator in the form of a software driver that improves performance of 32-bit applications running on 64-bit Intel Itanium-based systems, particularly those running Linux and Windows Server 2003 (i ...
- a dynamic binary translator designed to support IA-32 applications on
Itanium Itanium ( ) is a discontinued family of 64-bit Intel microprocessors that implement the Intel Itanium architecture (formerly called IA-64). Launched in June 2001, Intel marketed the processors for enterprise servers and high-performance comput ...
-based systems, which was included in
Microsoft Microsoft Corporation is an American multinational technology corporation producing computer software, consumer electronics, personal computers, and related services headquartered at the Microsoft Redmond campus located in Redmond, Washi ...
Windows Server Windows Server (formerly Windows NT Server) is a group of operating systems (OS) for servers that Microsoft has been developing since July 27, 1993. The first OS that was released for this platform was Windows NT 3.1 Advanced Server. With the r ...
for
Itanium Itanium ( ) is a discontinued family of 64-bit Intel microprocessors that implement the Intel Itanium architecture (formerly called IA-64). Launched in June 2001, Intel marketed the processors for enterprise servers and high-performance comput ...
architecture, as well as in several flavors of
Linux Linux ( or ) is a family of open-source Unix-like operating systems based on the Linux kernel, an operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically packaged as a Linux distribution, whi ...
, including
Red Hat Red Hat, Inc. is an American software company that provides open source software products to enterprises. Founded in 1993, Red Hat has its corporate headquarters in Raleigh, North Carolina, with other offices worldwide. Red Hat has become a ...
and
Suse SUSE ( , ) is a German-based multinational open-source software company that develops and sells Linux products to business customers. Founded in 1992, it was the first company to market Linux for enterprise. It is the developer of SUSE Linux Ent ...
. It allowed IA-32 applications to run faster than they would using the native IA-32 mode on Itanium processors. *
Dolphin A dolphin is an aquatic mammal within the infraorder Cetacea. Dolphin species belong to the families Delphinidae (the oceanic dolphins), Platanistidae (the Indian river dolphins), Iniidae (the New World river dolphins), Pontoporiidae (the b ...
(an emulator for the
GameCube The is a home video game console developed and released by Nintendo in Japan on September 14, 2001, in North America on November 18, 2001, and in PAL territories in 2002. It is the successor to the Nintendo 64 (1996), and predecessor of the ...
/
Wii The Wii ( ) is a home video game console developed and marketed by Nintendo. It was released on November 19, 2006, in North America and in December 2006 for most other regions of the world. It is Nintendo's fifth major home game console, ...
) performs JIT recompilation of PowerPC code to x86 and AArch64.


Examples for dynamic binary translations in hardware

* x86 Intel CPUs since the
Pentium Pro The Pentium Pro is a sixth-generation x86 microprocessor developed and manufactured by Intel and introduced on November 1, 1995. It introduced the P6 microarchitecture (sometimes termed i686) and was originally intended to replace the original ...
translate complex CISC x86 instructions to more
RISC In computer engineering, a reduced instruction set computer (RISC) is a computer designed to simplify the individual instructions given to the computer to accomplish tasks. Compared to the instructions given to a complex instruction set comp ...
-like internal
micro-operation In computer central processing units, micro-operations (also known as micro-ops or μops, historically also as micro-actions) are detailed low-level instructions used in some designs to implement complex machine instructions (sometimes termed m ...
s. *
Nvidia Nvidia CorporationOfficially written as NVIDIA and stylized in its logo as VIDIA with the lowercase "n" the same height as the uppercase "VIDIA"; formerly stylized as VIDIA with a large italicized lowercase "n" on products from the mid 1990s to ...
Tegra Tegra is a system on a chip (SoC) series developed by Nvidia for mobile devices such as smartphones, personal digital assistants, and mobile Internet devices. The Tegra integrates an ARM architecture central processing unit (CPU), graphics proc ...
K1 Denver translates
ARM In human anatomy, the arm refers to the upper limb in common usage, although academically the term specifically means the upper arm between the glenohumeral joint (shoulder joint) and the elbow joint. The distal part of the upper limb between th ...
instructions over a slow hardware decoder to its native microcode instructions and uses a software binary translator for hot code.


See also

* Binary optimization * Binary recompilation *
Dynamic recompilation In computer science, dynamic recompilation is a feature of some emulators and virtual machines, where the system may recompile some part of a program during execution. By compiling during execution, the system can tailor the generated code t ...
*
Just-in-time compilation In computing, just-in-time (JIT) compilation (also dynamic translation or run-time compilations) is a way of executing computer code that involves compilation during execution of a program (at run time) rather than before execution. This may co ...
*
Instruction set simulator An instruction set simulator (ISS) is a simulation model, usually coded in a high-level programming language, which mimics the behavior of a mainframe or microprocessor by "reading" instructions and maintaining internal variables which represent t ...
*
Emulator In computing, an emulator is hardware or software that enables one computer system (called the ''host'') to behave like another computer system (called the ''guest''). An emulator typically enables the host system to run software or use pe ...
*
Virtual machine In computing, a virtual machine (VM) is the virtualization/ emulation of a computer system. Virtual machines are based on computer architectures and provide functionality of a physical computer. Their implementations may involve specialized h ...
*
Comparison of platform virtualization software Platform virtualization software, specifically emulators and hypervisors, are software packages that emulate the whole physical computer machine, often providing multiple virtual machines on one physical platform. The table below compares basic ...
*
Shadow memory In computing, shadow memory is a technique used to track and store information on computer memory used by a program during its execution. Shadow memory consists of shadow bytes that map to individual bits or one or more bytes in main memory. These ...


References


Further reading

* * * * * * * {{Refend Emulation software Interpreters (computing) Virtualization