HOME

TheInfoList



OR:

In
computing Computing is any goal-oriented activity requiring, benefiting from, or creating computing machinery. It includes the study and experimentation of algorithmic processes, and development of both hardware and software. Computing has scientific, ...
, just-in-time (JIT) compilation (also dynamic translation or run-time compilations) is a way of executing
computer code A computer is a machine that can be programmed to carry out sequences of arithmetic or logical operations (computation) automatically. Modern digital electronic computers can perform generic sets of operations known as programs. These progra ...
that involves compilation during execution of a program (at run time) rather than before execution. This may consist of
source code In computing, source code, or simply code, is any collection of code, with or without comments, written using a human-readable programming language, usually as plain text. The source code of a program is specially designed to facilitate the ...
translation but is more commonly
bytecode Bytecode (also called portable code or p-code) is a form of instruction set designed for efficient execution by a software interpreter. Unlike human-readable source code, bytecodes are compact numeric codes, constants, and references (norma ...
translation to
machine code In computer programming, machine code is any low-level programming language, consisting of machine language instructions, which are used to control a computer's central processing unit (CPU). Each instruction causes the CPU to perform a ve ...
, which is then executed directly. A system implementing a JIT compiler typically continuously analyses the code being executed and identifies parts of the code where the speedup gained from compilation or recompilation would outweigh the overhead of compiling that code. JIT compilation is a combination of the two traditional approaches to translation to machine code—ahead-of-time compilation (AOT), and
interpretation Interpretation may refer to: Culture * Aesthetic interpretation, an explanation of the meaning of a work of art * Allegorical interpretation, an approach that assumes a text should not be interpreted literally * Dramatic Interpretation, an event ...
—and combines some advantages and drawbacks of both. Roughly, JIT compilation combines the speed of compiled code with the flexibility of interpretation, with the overhead of an interpreter and the additional overhead of compiling and linking (not just interpreting). JIT compilation is a form of
dynamic compilation Dynamic compilation is a process used by some programming language implementations to gain performance during program execution. Although the technique originated in Smalltalk,Peter L. Deutsch and Alan Schiffman. "Efficient Implementation of the ...
, and allows
adaptive optimization Adaptive optimization is a technique in computer science that performs dynamic recompilation of portions of a program based on the current execution profile. With a simple implementation, an adaptive optimizer may simply make a trade-off between ...
such as
dynamic recompilation In computer science, dynamic recompilation is a feature of some emulators and virtual machines, where the system may recompile some part of a program during execution. By compiling during execution, the system can tailor the generated code t ...
and
microarchitecture In computer engineering, microarchitecture, also called computer organization and sometimes abbreviated as µarch or uarch, is the way a given instruction set architecture (ISA) is implemented in a particular processor. A given ISA may be imp ...
-specific speedups. Interpretation and JIT compilation are particularly suited for
dynamic programming language In computer science, a dynamic programming language is a class of high-level programming languages, which at runtime execute many common programming behaviours that static programming languages perform during compilation. These behaviors co ...
s, as the runtime system can handle late-bound data types and enforce security guarantees.


History

The earliest published JIT compiler is generally attributed to work on
LISP A lisp is a speech impairment in which a person misarticulates sibilants (, , , , , , , ). These misarticulations often result in unclear speech. Types * A frontal lisp occurs when the tongue is placed anterior to the target. Interdental lispin ...
by John McCarthy in 1960. In his seminal paper ''Recursive functions of symbolic expressions and their computation by machine, Part I'', he mentions functions that are translated during runtime, thereby sparing the need to save the compiler output to
punch card A punched card (also punch card or punched-card) is a piece of stiff paper that holds digital data represented by the presence or absence of holes in predefined positions. Punched cards were once common in data processing applications or to di ...
s (although this would be more accurately known as a " Compile and go system"). Another early example was by
Ken Thompson Kenneth Lane Thompson (born February 4, 1943) is an American pioneer of computer science. Thompson worked at Bell Labs for most of his career where he designed and implemented the original Unix operating system. He also invented the B programmi ...
, who in 1968 gave one of the first applications of
regular expression A regular expression (shortened as regex or regexp; sometimes referred to as rational expression) is a sequence of characters that specifies a search pattern in text. Usually such patterns are used by string-searching algorithms for "find" ...
s, here for
pattern matching In computer science, pattern matching is the act of checking a given sequence of tokens for the presence of the constituents of some pattern. In contrast to pattern recognition, the match usually has to be exact: "either it will or will not be ...
in the text editor QED. For speed, Thompson implemented regular expression matching by JITing to
IBM 7094 The IBM 7090 is a second-generation transistorized version of the earlier IBM 709 vacuum tube mainframe computer that was designed for "large-scale scientific and technological applications". The 7090 is the fourth member of the IBM 700/7000 s ...
code on the
Compatible Time-Sharing System The Compatible Time-Sharing System (CTSS) was the first general purpose time-sharing operating system. Compatible Time Sharing referred to time sharing which was compatible with batch processing; it could offer both time sharing and batch proces ...
. An influential technique for deriving compiled code from interpretation was pioneered by James G. Mitchell in 1970, which he implemented for the experimental language ''LC²''.
Smalltalk Smalltalk is an object-oriented, dynamically typed reflective programming language. It was designed and created in part for educational use, specifically for constructionist learning, at the Learning Research Group (LRG) of Xerox PARC by Alan ...
(c. 1983) pioneered new aspects of JIT compilations. For example, translation to machine code was done on demand, and the result was cached for later use. When memory became scarce, the system would delete some of this code and regenerate it when it was needed again. Sun's
Self The self is an individual as the object of that individual’s own reflective consciousness. Since the ''self'' is a reference by a subject to the same subject, this reference is necessarily subjective. The sense of having a self—or ''selfhoo ...
language improved these techniques extensively and was at one point the fastest Smalltalk system in the world, achieving up to half the speed of optimized C but with a fully object-oriented language. Self was abandoned by Sun, but the research went into the Java language. The term "Just-in-time compilation" was borrowed from the manufacturing term " Just in time" and popularized by Java, with James Gosling using the term from 1993. Currently JITing is used by most implementations of the
Java Virtual Machine A Java virtual machine (JVM) is a virtual machine that enables a computer to run Java programs as well as programs written in other languages that are also compiled to Java bytecode. The JVM is detailed by a specification that formally describe ...
, as HotSpot builds on, and extensively uses, this research base. The HP project Dynamo was an experimental JIT compiler where the 'bytecode' format and the machine code format were the same; the system turned PA-6000 machine code into
PA-8000 The PA-8000 (PCX-U), code-named ''Onyx'', is a microprocessor developed and fabricated by Hewlett-Packard (HP) that implemented the PA-RISC 2.0 instruction set architecture (ISA). Hunt 1995 It was a completely new design with no circuitry deriv ...
machine code. Counterintuitively, this resulted in speed ups, in some cases of 30% since doing this permitted optimizations at the machine code level, for example, inlining code for better cache usage and optimizations of calls to dynamic libraries and many other run-time optimizations which conventional compilers are not able to attempt. In November 2020, PHP 8.0 introduced a JIT compiler.


Design

In a bytecode-compiled system,
source code In computing, source code, or simply code, is any collection of code, with or without comments, written using a human-readable programming language, usually as plain text. The source code of a program is specially designed to facilitate the ...
is translated to an intermediate representation known as
bytecode Bytecode (also called portable code or p-code) is a form of instruction set designed for efficient execution by a software interpreter. Unlike human-readable source code, bytecodes are compact numeric codes, constants, and references (norma ...
. Bytecode is not the machine code for any particular computer, and may be
portable Portable may refer to: General * Portable building, a manufactured structure that is built off site and moved in upon completion of site and utility work * Portable classroom, a temporary building installed on the grounds of a school to provide ...
among computer architectures. The bytecode may then be interpreted by, or run on a
virtual machine In computing, a virtual machine (VM) is the virtualization/ emulation of a computer system. Virtual machines are based on computer architectures and provide functionality of a physical computer. Their implementations may involve specialized h ...
. The JIT compiler reads the bytecodes in many sections (or in full, rarely) and compiles them dynamically into machine code so the program can run faster. This can be done per-file, per-function or even on any arbitrary code fragment; the code can be compiled when it is about to be executed (hence the name "just-in-time"), and then cached and reused later without needing to be recompiled. In contrast, a traditional ''interpreted virtual machine'' will simply interpret the bytecode, generally with much lower performance. Some ''interpreter''s even interpret source code, without the step of first compiling to bytecode, with even worse performance. ''Statically-compiled code'' or ''native code'' is compiled prior to deployment. A ''dynamic compilation environment'' is one in which the compiler can be used during execution. A common goal of using JIT techniques is to reach or surpass the performance of static compilation, while maintaining the advantages of bytecode interpretation: Much of the "heavy lifting" of parsing the original source code and performing basic optimization is often handled at compile time, prior to deployment: compilation from bytecode to machine code is much faster than compiling from source. The deployed bytecode is portable, unlike native code. Since the runtime has control over the compilation, like interpreted bytecode, it can run in a secure sandbox. Compilers from bytecode to machine code are easier to write, because the portable bytecode compiler has already done much of the work. JIT code generally offers far better performance than interpreters. In addition, it can in some cases offer better performance than static compilation, as many optimizations are only feasible at run-time: # The compilation can be optimized to the targeted CPU and the operating system model where the application runs. For example, JIT can choose
SSE2 SSE2 (Streaming SIMD Extensions 2) is one of the Intel SIMD (Single Instruction, Multiple Data) processor supplementary instruction sets first introduced by Intel with the initial version of the Pentium 4 in 2000. It extends the earlier SSE i ...
vector CPU instructions when it detects that the CPU supports them. To obtain this level of optimization specificity with a static compiler, one must either compile a binary for each intended platform/architecture, or else include multiple versions of portions of the code within a single binary. # The system is able to collect statistics about how the program is actually running in the environment it is in, and it can rearrange and recompile for optimum performance. However, some static compilers can also take profile information as input. # The system can do global code optimizations (e.g.
inlining In computing, inline expansion, or inlining, is a manual or compiler optimization that replaces a function call site with the body of the called function. Inline expansion is similar to macro expansion, but occurs during compilation, without cha ...
of library functions) without losing the advantages of dynamic linking and without the overheads inherent to static compilers and linkers. Specifically, when doing global inline substitutions, a static compilation process may need run-time checks and ensure that a virtual call would occur if the actual class of the object overrides the inlined method, and boundary condition checks on array accesses may need to be processed within loops. With just-in-time compilation in many cases this processing can be moved out of loops, often giving large increases of speed. # Although this is possible with statically compiled garbage collected languages, a bytecode system can more easily rearrange executed code for better cache utilization. Because a JIT must render and execute a native binary image at runtime, true machine-code JITs necessitate platforms that allow for data to be executed at runtime, making using such JITs on a
Harvard architecture The Harvard architecture is a computer architecture with separate storage and signal pathways for instructions and data. It contrasts with the von Neumann architecture, where program instructions and data share the same memory and pathway ...
-based machine impossible; the same can be said for certain operating systems and virtual machines as well. However, a special type of "JIT" may potentially ''not'' target the physical machine's CPU architecture, but rather an optimized VM bytecode where limitations on raw machine code prevail, especially where that bytecode's VM eventually leverages a JIT to native code.


Performance

JIT causes a slight to noticeable delay in the initial execution of an application, due to the time taken to load and compile the bytecode. Sometimes this delay is called "startup time delay" or "warm-up time". In general, the more optimization JIT performs, the better the code it will generate, but the initial delay will also increase. A JIT compiler therefore has to make a trade-off between the compilation time and the quality of the code it hopes to generate. Startup time can include increased IO-bound operations in addition to JIT compilation: for example, the ''rt.jar'' class data file for the
Java Virtual Machine A Java virtual machine (JVM) is a virtual machine that enables a computer to run Java programs as well as programs written in other languages that are also compiled to Java bytecode. The JVM is detailed by a specification that formally describe ...
(JVM) is 40 MB and the JVM must seek a lot of data in this contextually huge file. One possible optimization, used by Sun's HotSpot Java Virtual Machine, is to combine interpretation and JIT compilation. The application code is initially interpreted, but the JVM monitors which sequences of
bytecode Bytecode (also called portable code or p-code) is a form of instruction set designed for efficient execution by a software interpreter. Unlike human-readable source code, bytecodes are compact numeric codes, constants, and references (norma ...
are frequently executed and translates them to machine code for direct execution on the hardware. For bytecode which is executed only a few times, this saves the compilation time and reduces the initial latency; for frequently executed bytecode, JIT compilation is used to run at high speed, after an initial phase of slow interpretation. Additionally, since a program spends most time executing a minority of its code, the reduced compilation time is significant. Finally, during the initial code interpretation, execution statistics can be collected before compilation, which helps to perform better optimization. The correct tradeoff can vary due to circumstances. For example, Sun's Java Virtual Machine has two major modes—client and server. In client mode, minimal compilation and optimization is performed, to reduce startup time. In server mode, extensive compilation and optimization is performed, to maximize performance once the application is running by sacrificing startup time. Other Java just-in-time compilers have used a runtime measurement of the number of times a method has executed combined with the bytecode size of a method as a heuristic to decide when to compile. Still another uses the number of times executed combined with the detection of loops. In general, it is much harder to accurately predict which methods to optimize in short-running applications than in long-running ones.
Native Image Generator The Native Image Generator, or simply NGen, is the ahead-of-time compilation (AOT) service of the .NET Framework. It allows a CLI assembly to be pre-compiled instead of letting the Common Language Runtime (CLR) do a just-in-time compilation (JI ...
(Ngen) by
Microsoft Microsoft Corporation is an American multinational technology corporation producing computer software, consumer electronics, personal computers, and related services headquartered at the Microsoft Redmond campus located in Redmond, Washi ...
is another approach at reducing the initial delay. Ngen pre-compiles (or "pre-JITs") bytecode in a
Common Intermediate Language Common Intermediate Language (CIL), formerly called Microsoft Intermediate Language (MSIL) or Intermediate Language (IL), is the intermediate language binary instruction set defined within the Common Language Infrastructure (CLI) specification. ...
image into machine native code. As a result, no runtime compilation is needed.
.NET Framework The .NET Framework (pronounced as "''dot net"'') is a proprietary software framework developed by Microsoft that runs primarily on Microsoft Windows. It was the predominant implementation of the Common Language Infrastructure (CLI) until bein ...
2.0 shipped with Visual Studio 2005 runs Ngen on all of the Microsoft library DLLs right after the installation. Pre-jitting provides a way to improve the startup time. However, the quality of code it generates might not be as good as the one that is JITed, for the same reasons why code compiled statically, without
profile-guided optimization Profile-guided optimization (PGO, sometimes pronounced as ''pogo''), also known as profile-directed feedback (PDF), and feedback-directed optimization (FDO) is a compiler optimization technique in computer programming that uses profiling to imp ...
, cannot be as good as JIT compiled code in the extreme case: the lack of profiling data to drive, for instance, inline caching. There also exist Java implementations that combine an AOT (ahead-of-time) compiler with either a JIT compiler (
Excelsior JET Excelsior JET is a now-defunct proprietary Java SE technology implementation built around an ahead-of-time (AOT) Java to native code compiler. The compiler transforms the portable Java bytecode into optimized executables for the desired hardware a ...
) or interpreter (
GNU Compiler for Java The GNU Compiler for Java (GCJ) is a free compiler for the Java programming language. It was part of the GNU Compiler Collection for over ten years but as of 2017 it is no longer maintained and will not be part of future releases. GCJ compiles ...
). JIT compilation may not reliably achieve its goal, namely entering a steady state of improved performance after a short initial warmup period. Across eight different virtual machines, measured six widely-used microbenchmarks which are commonly used by virtual machine implementors as optimisation targets, running them repeatedly within a single process execution. On
Linux Linux ( or ) is a family of open-source Unix-like operating systems based on the Linux kernel, an operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically packaged as a Linux distribution, whi ...
, they found that 8.7% to 9.6% of process executions failed to reach a steady state of performance, 16.7% to 17.9% entered a steady state of ''reduced'' performance after a warmup period, and 56.5% pairings of a specific virtual machine running a specific benchmark failed to consistently see a steady-state non-degradation of performance across multiple executions (i.e., at least one execution failed to reach a steady state or saw reduced performance in the steady state). Even where an improved steady-state was reached, it sometimes took many hundreds of iterations. instead focused on the HotSpot virtual machine but with a much wider array of benchmarks, finding that 10.9% of process executions failed to reach a steady state of performance, and 43.5% of benchmarks did not consistently attain a steady state across multiple executions.


Security

JIT compilation fundamentally uses executable data, and thus poses security challenges and possible exploits. Implementation of JIT compilation consists of compiling source code or byte code to machine code and executing it. This is generally done directly in memory: the JIT compiler outputs the machine code directly into memory and immediately executes it, rather than outputting it to disk and then invoking the code as a separate program, as in usual ahead of time compilation. In modern architectures this runs into a problem due to
executable space protection In computer security, executable-space protection marks memory regions as non-executable, such that an attempt to execute machine code in these regions will cause an exception. It makes use of hardware features such as the NX bit (no-execute bi ...
: arbitrary memory cannot be executed, as otherwise there is a potential security hole. Thus memory must be marked as executable; for security reasons this should be done ''after'' the code has been written to memory, and marked read-only, as writable/executable memory is a security hole (see W^X). For instance Firefox's JIT compiler for Javascript introduced this protection in a release version with Firefox 46.
JIT spraying JIT spraying is a class of computer security exploit that circumvents the protection of address space layout randomization (ASLR) and data execution prevention (DEP) by exploiting the behavior of just-in-time compilation. It has been used to expl ...
is a class of
computer security exploit An exploit (from the English verb ''to exploit'', meaning "to use something to one’s own advantage") is a piece of software, a chunk of data, or a sequence of commands that takes advantage of a bug or vulnerability to cause unintended or unanti ...
s that use JIT compilation for heap spraying: the resulting memory is then executable, which allows an exploit if execution can be moved into the heap.


Uses

JIT compilation can be applied to some programs, or can be used for certain capacities, particularly dynamic capacities such as
regular expression A regular expression (shortened as regex or regexp; sometimes referred to as rational expression) is a sequence of characters that specifies a search pattern in text. Usually such patterns are used by string-searching algorithms for "find" ...
s. For example, a text editor may compile a regular expression provided at runtime to machine code to allow faster matching: this cannot be done ahead of time, as the pattern is only provided at runtime. Several modern
runtime environment In computer programming, a runtime system or runtime environment is a sub-system that exists both in the computer where a program is created, as well as in the computers where the program is intended to be run. The name comes from the compile ...
s rely on JIT compilation for high-speed code execution, including most implementations of
Java Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's mo ...
, together with
Microsoft Microsoft Corporation is an American multinational technology corporation producing computer software, consumer electronics, personal computers, and related services headquartered at the Microsoft Redmond campus located in Redmond, Washi ...
's .NET. Similarly, many regular-expression libraries feature JIT compilation of regular expressions, either to bytecode or to machine code. JIT compilation is also used in some emulators, in order to translate machine code from one CPU architecture to another. A common implementation of JIT compilation is to first have AOT compilation to bytecode (
virtual machine In computing, a virtual machine (VM) is the virtualization/ emulation of a computer system. Virtual machines are based on computer architectures and provide functionality of a physical computer. Their implementations may involve specialized h ...
code), known as ''bytecode compilation'', and then have JIT compilation to machine code (dynamic compilation), rather than interpretation of the bytecode. This improves the runtime performance compared to interpretation, at the cost of lag due to compilation. JIT compilers translate continuously, as with interpreters, but caching of compiled code minimizes lag on future execution of the same code during a given run. Since only part of the program is compiled, there is significantly less lag than if the entire program were compiled prior to execution.


See also

*
Binary translation In computing, binary translation is a form of binary recompilation where sequences of instructions are translated from a ''source'' instruction set to the ''target'' instruction set. In some cases such as instruction set simulation, the target ...
*
Common Language Runtime The Common Language Runtime (CLR), the virtual machine component of Microsoft .NET Framework, manages the execution of .NET programs. Just-in-time compilation converts the managed code (compiled intermediate language code) into machine instru ...
*
Dynamic compilation Dynamic compilation is a process used by some programming language implementations to gain performance during program execution. Although the technique originated in Smalltalk,Peter L. Deutsch and Alan Schiffman. "Efficient Implementation of the ...
* GNU lightning *
LLVM LLVM is a set of compiler and toolchain technologies that can be used to develop a front end for any programming language and a back end for any instruction set architecture. LLVM is designed around a language-independent intermediate repre ...
* OVPsim * Self-modifying code *
Tracing just-in-time compilation Tracing just-in-time compilation is a technique used by virtual machines to optimize the execution of a program at runtime. This is done by recording a linear sequence of frequently executed operations, compiling them to native machine code an ...
*
Transmeta Crusoe The Transmeta Crusoe is a family of x86-compatible microprocessors developed by Transmeta and introduced in 2000. Instead of the instruction set architecture being implemented in hardware, or translated by specialized hardware, the Crusoe runs a ...


Notes


References


Bibliography

* * * *


External links


Free Online Dictionary of Computing entry

Mozilla Nanojit
{{Webarchive, url=https://web.archive.org/web/20120509215101/https://developer.mozilla.org/En/Nanojit , date=2012-05-09 : A small, cross-platform C++ library that emits machine code. It is used as the JIT for the Mozilla
Tamarin The tamarins are squirrel-sized New World monkeys from the family Callitrichidae in the genus ''Saguinus''. They are the first offshoot in the Callitrichidae tree, and therefore are the sister group of a clade formed by the lion tamarins, Goe ...
and
SpiderMonkey SpiderMonkey is the first JavaScript engine, written by Brendan Eich at Netscape Communications, later released as open source and currently maintained by the Mozilla Foundation. It is used in the Firefox web browser. History Eich "wrote ...
Javascript engines.
Profiling Runtime Generated and Interpreted Code using the VTune Performance Analyzer
Compiler construction Emulation software Virtualization