HOME

TheInfoList



OR:

LLVM, also called LLVM Core, is a target-independent optimizer and code generator. It can be used to develop a frontend for any
programming language A programming language is a system of notation for writing computer programs. Programming languages are described in terms of their Syntax (programming languages), syntax (form) and semantics (computer science), semantics (meaning), usually def ...
and a backend for any
instruction set architecture In computer science, an instruction set architecture (ISA) is an abstract model that generally defines how software controls the CPU in a computer or a family of computers. A device or program that executes instructions described by that ISA, ...
. LLVM is designed around a language-independent
intermediate representation An intermediate representation (IR) is the data structure or code used internally by a compiler or virtual machine to represent source code. An IR is designed to be conducive to further processing, such as optimization and translation. A "good" ...
(IR) that serves as a
portable Portable may refer to: General * Portable building, a manufactured structure that is built off site and moved in upon completion of site and utility work * Portable classroom, a temporary building installed on the grounds of a school to provide a ...
, high-level
assembly language In computing, assembly language (alternatively assembler language or symbolic machine code), often referred to simply as assembly and commonly abbreviated as ASM or asm, is any low-level programming language with a very strong correspondence bet ...
that can be optimized with a variety of transformations over multiple passes. The name ''LLVM'' originally stood for ''Low Level Virtual Machine.'' However, the project has since expanded, and the name is no longer an acronym but an orphan initialism. LLVM is written in C++ and is designed for compile-time, link-time, runtime, and "idle-time" optimization. Originally implemented for C and C++, the language-agnostic design of LLVM has since spawned a wide variety of frontends: languages with compilers that use LLVM (or which do not directly use LLVM but can generate compiled programs as LLVM IR) include
ActionScript ActionScript is an object-oriented programming language originally developed by Macromedia Inc. (later acquired by Adobe). It is influenced by HyperTalk, the scripting language for HyperCard. It is now an implementation of ECMAScript (mean ...
, Ada, C# for
.NET The .NET platform (pronounced as "''dot net"'') is a free and open-source, managed code, managed computer software framework for Microsoft Windows, Windows, Linux, and macOS operating systems. The project is mainly developed by Microsoft emplo ...
,
Common Lisp Common Lisp (CL) is a dialect of the Lisp programming language, published in American National Standards Institute (ANSI) standard document ''ANSI INCITS 226-1994 (S2018)'' (formerly ''X3.226-1994 (R1999)''). The Common Lisp HyperSpec, a hyperli ...
, PicoLisp,
Crystal A crystal or crystalline solid is a solid material whose constituents (such as atoms, molecules, or ions) are arranged in a highly ordered microscopic structure, forming a crystal lattice that extends in all directions. In addition, macros ...
,
CUDA In computing, CUDA (Compute Unified Device Architecture) is a proprietary parallel computing platform and application programming interface (API) that allows software to use certain types of graphics processing units (GPUs) for accelerated gene ...
, D,
Delphi Delphi (; ), in legend previously called Pytho (Πυθώ), was an ancient sacred precinct and the seat of Pythia, the major oracle who was consulted about important decisions throughout the ancient Classical antiquity, classical world. The A ...
, Dylan, Forth, Fortran,
FreeBASIC FreeBASIC is a FOSS, free and open source multiplatform compiler and programming language based on BASIC licensed under the GNU General Public License, GNU GPL for Microsoft Windows, protected-mode MS-DOS (DOS extender), Linux, FreeBSD and Xbox ...
,
Free Pascal Free Pascal Compiler (FPC) is a compiler for the closely related programming-language dialects Pascal and Object Pascal. It is free software released under the GNU General Public License, witexception clausesthat allow static linking against it ...
, Halide,
Haskell Haskell () is a general-purpose, statically typed, purely functional programming language with type inference and lazy evaluation. Designed for teaching, research, and industrial applications, Haskell pioneered several programming language ...
, Idris, Jai (only for optimized release builds),
Java bytecode Java bytecode is the instruction set of the Java virtual machine (JVM), the language to which Java and other JVM-compatible source code is compiled. Each instruction is represented by a single byte, hence the name bytecode, making it a compact ...
, Julia, Kotlin, LabVIEW's G language,
Objective-C Objective-C is a high-level general-purpose, object-oriented programming language that adds Smalltalk-style message passing (messaging) to the C programming language. Originally developed by Brad Cox and Tom Love in the early 1980s, it was ...
,
OpenCL OpenCL (Open Computing Language) is a software framework, framework for writing programs that execute across heterogeneous computing, heterogeneous platforms consisting of central processing units (CPUs), graphics processing units (GPUs), di ...
,
PostgreSQL PostgreSQL ( ) also known as Postgres, is a free and open-source software, free and open-source relational database management system (RDBMS) emphasizing extensibility and SQL compliance. PostgreSQL features transaction processing, transactions ...
's SQL and PLpgSQL,
Ruby Ruby is a pinkish-red-to-blood-red-colored gemstone, a variety of the mineral corundum ( aluminium oxide). Ruby is one of the most popular traditional jewelry gems and is very durable. Other varieties of gem-quality corundum are called sapph ...
,
Rust Rust is an iron oxide, a usually reddish-brown oxide formed by the reaction of iron and oxygen in the catalytic presence of water or air moisture. Rust consists of hydrous iron(III) oxides (Fe2O3·nH2O) and iron(III) oxide-hydroxide (FeO(OH) ...
, Scala,
Standard ML Standard ML (SML) is a General-purpose programming language, general-purpose, High-level programming language, high-level, Modular programming, modular, Functional programming, functional programming language with compile-time type checking and t ...
,
Swift Swift or SWIFT most commonly refers to: * SWIFT, an international organization facilitating transactions between banks ** SWIFT code * Swift (programming language) * Swift (bird), a family of birds It may also refer to: Organizations * SWIF ...
, Xojo, and Zig.


History

The LLVM project started in 2000 at the
University of Illinois at Urbana–Champaign The University of Illinois Urbana-Champaign (UIUC, U of I, Illinois, or University of Illinois) is a public land-grant research university in the Champaign–Urbana metropolitan area, Illinois, United States. Established in 1867, it is the f ...
, under the direction of Vikram Adve and
Chris Lattner Christopher Arthur Lattner (born 1978) is an American software engineer and creator of LLVM, the Clang compiler, the Swift (programming language), Swift programming language and the MLIR (software), MLIR compiler infrastructure. After his PhD ...
. LLVM was originally developed as a research infrastructure to investigate dynamic compilation techniques for static and dynamic programming languages. LLVM was released under the
University of Illinois/NCSA Open Source License The University of Illinois/NCSA Open Source License is a permissive free software license, based on the MIT/X11 license and the 3-clause BSD license. By combining parts of these two licenses, it attempts to be clearer and more concise than ...
, a permissive free software licence. In 2005,
Apple Inc. Apple Inc. is an American multinational corporation and technology company headquartered in Cupertino, California, in Silicon Valley. It is best known for its consumer electronics, software, and services. Founded in 1976 as Apple Comput ...
hired Lattner and formed a team to work on the LLVM system for various uses within Apple's development systems. LLVM has been an integral part of Apple's
Xcode Xcode is a suite of developer tools for building apps on Apple devices. It includes an integrated development environment (IDE) of the same name for macOS, used to develop software for macOS, iOS, iPadOS, watchOS, tvOS, and visionOS. It w ...
development tools for
macOS macOS, previously OS X and originally Mac OS X, is a Unix, Unix-based operating system developed and marketed by Apple Inc., Apple since 2001. It is the current operating system for Apple's Mac (computer), Mac computers. With ...
and
iOS Ios, Io or Nio (, ; ; locally Nios, Νιός) is a Greek island in the Cyclades group in the Aegean Sea. Ios is a hilly island with cliffs down to the sea on most sides. It is situated halfway between Naxos and Santorini. It is about long an ...
since Xcode 4 in 2011. In 2006, Lattner started working on a new project named
Clang Clang () is a compiler front end for the programming languages C, C++, Objective-C, Objective-C++, and the software frameworks OpenMP, OpenCL, RenderScript, CUDA, SYCL, and HIP. It acts as a drop-in replacement for the GNU Compiler ...
. The combination of the Clang frontend and LLVM backend is named Clang/LLVM or simply Clang. The name ''LLVM'' was originally an
initialism An acronym is a type of abbreviation consisting of a phrase whose only pronounced elements are the initial letters or initial sounds of words inside that phrase. Acronyms are often spelled with the initial letter of each word in all caps wi ...
for ''Low Level Virtual Machine''. However, the LLVM project evolved into an umbrella project that has little relationship to what most current developers think of as a
virtual machine In computing, a virtual machine (VM) is the virtualization or emulator, emulation of a computer system. Virtual machines are based on computer architectures and provide the functionality of a physical computer. Their implementations may involve ...
. This made the initialism "confusing" and "inappropriate", and since 2011 LLVM is "officially no longer an acronym", but a brand that applies to the LLVM umbrella project. The project encompasses the LLVM
intermediate representation An intermediate representation (IR) is the data structure or code used internally by a compiler or virtual machine to represent source code. An IR is designed to be conducive to further processing, such as optimization and translation. A "good" ...
(IR), the LLVM
debugger A debugger is a computer program used to test and debug other programs (the "target" programs). Common features of debuggers include the ability to run or halt the target program using breakpoints, step through code line by line, and display ...
, the LLVM implementation of the
C++ Standard Library The C standard library, sometimes referred to as libc, is the standard library for the C programming language, as specified in the ISO C standard.ISO/ IEC (2018). '' ISO/IEC 9899:2018(E): Programming Languages - C §7'' Starting from the origina ...
(with full support of
C++11 C++11 is a version of a joint technical standard, ISO/IEC 14882, by the International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC), for the C++ programming language. C++11 replaced the prior vers ...
and
C++14 C14, C.XIV or C-14 may refer to: Time * The 14th century * Carbon-14, a radioactive isotope of carbon ** Radiocarbon dating, C-14 dating, a method for dating events Science * IEC 60320#C14, IEC 60320 C14, a polarised, three pole socket electrical ...
), etc. LLVM is administered by the LLVM Foundation. Compiler engineer Tanya Lattner became its president in 2014 and was in post . ''"For designing and implementing LLVM"'', the
Association for Computing Machinery The Association for Computing Machinery (ACM) is a US-based international learned society for computing. It was founded in 1947 and is the world's largest scientific and educational computing society. The ACM is a non-profit professional membe ...
presented Vikram Adve, Chris Lattner, and Evan Cheng with the 2012 ACM Software System Award. The project was originally available under the UIUC license. After v9.0.0 released in 2019, LLVM relicensed to the Apache License 2.0 with LLVM Exceptions. about 400 contributions had not been relicensed.


Features

LLVM can provide the middle layers of a complete compiler system, taking
intermediate representation An intermediate representation (IR) is the data structure or code used internally by a compiler or virtual machine to represent source code. An IR is designed to be conducive to further processing, such as optimization and translation. A "good" ...
(IR) code from a
compiler In computing, a compiler is a computer program that Translator (computing), translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primaril ...
and emitting an optimized IR. This new IR can then be converted and linked into machine-dependent
assembly language In computing, assembly language (alternatively assembler language or symbolic machine code), often referred to simply as assembly and commonly abbreviated as ASM or asm, is any low-level programming language with a very strong correspondence bet ...
code for a target platform. LLVM can accept the IR from the
GNU Compiler Collection The GNU Compiler Collection (GCC) is a collection of compilers from the GNU Project that support various programming languages, Computer architecture, hardware architectures, and operating systems. The Free Software Foundation (FSF) distributes ...
(GCC)
toolchain A toolchain is a set of software development tools used to build and otherwise develop software. Often, the tools are executed sequentially and form a pipeline such that the output of one tool is the input for the next. Sometimes the term is us ...
, allowing it to be used with a wide array of extant compiler front-ends written for that project. LLVM can also be built with gcc after version 7.5. LLVM can also generate relocatable machine code at compile-time or link-time or even binary machine code at runtime. LLVM supports a language-independent
instruction set In computer science, an instruction set architecture (ISA) is an abstract model that generally defines how software controls the CPU in a computer or a family of computers. A device or program that executes instructions described by that ISA, s ...
and
type system In computer programming, a type system is a logical system comprising a set of rules that assigns a property called a ''type'' (for example, integer, floating point, string) to every '' term'' (a word, phrase, or other set of symbols). Usu ...
. Each instruction is in static single assignment form (SSA), meaning that each variable (called a typed register) is assigned once and then frozen. This helps simplify the analysis of dependencies among variables. LLVM allows code to be compiled statically, as it is under the traditional GCC system, or left for late-compiling from the IR to machine code via just-in-time compilation (JIT), similar to
Java Java is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea (a part of Pacific Ocean) to the north. With a population of 156.9 million people (including Madura) in mid 2024, proje ...
. The type system consists of basic types such as
integer An integer is the number zero (0), a positive natural number (1, 2, 3, ...), or the negation of a positive natural number (−1, −2, −3, ...). The negations or additive inverses of the positive natural numbers are referred to as negative in ...
or
floating-point In computing, floating-point arithmetic (FP) is arithmetic on subsets of real numbers formed by a ''significand'' (a Sign (mathematics), signed sequence of a fixed number of digits in some Radix, base) multiplied by an integer power of that ba ...
numbers and five derived types:
pointers Pointer may refer to: People with the name * Pointer (surname), a surname (including a list of people with the name) * Pointer Williams (born 1974), American former basketball player Arts, entertainment, and media * ''Pointer'' (journal), the ...
,
arrays An array is a systematic arrangement of similar objects, usually in rows and columns. Things called an array include: {{TOC right Music * In twelve-tone and serial composition, the presentation of simultaneous twelve-tone sets such that the ...
, vectors, structures, and functions. A type construct in a concrete language can be represented by combining these basic types in LLVM. For example, a class in C++ can be represented by a mix of structures, functions and arrays of
function pointer A function pointer, also called a subroutine pointer or procedure pointer, is a pointer referencing executable code, rather than data. Dereferencing the function pointer yields the referenced function, which can be invoked and passed arguments ...
s. The LLVM JIT compiler can optimize unneeded static branches out of a program at runtime, and thus is useful for partial evaluation in cases where a program has many options, most of which can easily be determined unneeded in a specific environment. This feature is used in the
OpenGL OpenGL (Open Graphics Library) is a Language-independent specification, cross-language, cross-platform application programming interface (API) for rendering 2D computer graphics, 2D and 3D computer graphics, 3D vector graphics. The API is typic ...
pipeline of
Mac OS X Leopard Mac OS X Leopard (version 10.5) is the sixth software versioning, major release of macOS, Apple Inc., Apple's desktop and server operating system for Macintosh computers. Leopard was released on October 26, 2007, as the successor of Mac OS X Ti ...
(v10.5) to provide support for missing hardware features. Graphics code within the OpenGL stack can be left in intermediate representation and then compiled when run on the target machine. On systems with high-end
graphics processing unit A graphics processing unit (GPU) is a specialized electronic circuit designed for digital image processing and to accelerate computer graphics, being present either as a discrete video card or embedded on motherboards, mobile phones, personal ...
s (GPUs), the resulting code remains quite thin, passing the instructions on to the GPU with minimal changes. On systems with low-end GPUs, LLVM will compile optional procedures that run on the local
central processing unit A central processing unit (CPU), also called a central processor, main processor, or just processor, is the primary Processor (computing), processor in a given computer. Its electronic circuitry executes Instruction (computing), instructions ...
(CPU) that emulate instructions that the GPU cannot run internally. LLVM improved performance on low-end machines using
Intel GMA The Intel Graphics Media Accelerator (GMA) is a series of integrated graphics processors introduced in 2004 by Intel, replacing the earlier Intel Extreme Graphics series and being succeeded by the Intel HD and Iris Graphics series. This serie ...
chipsets. A similar system was developed under the Gallium3D LLVMpipe, and incorporated into the
GNOME A gnome () is a mythological creature and diminutive spirit in Renaissance magic and alchemy, introduced by Paracelsus in the 16th century and widely adopted by authors, including those of modern fantasy literature. They are typically depict ...
shell to allow it to run without a proper 3D hardware driver loaded. In 2011, programs compiled by GCC outperformed those from LLVM by 10%, on average. In 2013,
phoronix Phoronix Test Suite (PTS) is a free and open-source benchmark software for Linux and other operating systems. The Phoronix Test Suite, developed by Michael Larabel and Matthew Tippett, has been endorsed by sites such as Linux.com, LinuxPlanet ...
reported that LLVM had caught up with GCC, compiling binaries of approximately equal performance.


Components

LLVM has become an umbrella project containing multiple components.


Frontends

LLVM was originally written to be a replacement for the extant code generator in the GCC stack, and many of the GCC frontends have been modified to work with it, resulting in the now-defunct LLVM-GCC suite. The modifications generally involve a
GIMPLE The GNU Compiler Collection (GCC) is a collection of compilers from the GNU Project that support various programming languages, hardware architectures, and operating systems. The Free Software Foundation (FSF) distributes GCC as free software ...
-to-LLVM IR step so that LLVM optimizers and codegen can be used instead of GCC's GIMPLE system. Apple was a significant user of LLVM-GCC through
Xcode Xcode is a suite of developer tools for building apps on Apple devices. It includes an integrated development environment (IDE) of the same name for macOS, used to develop software for macOS, iOS, iPadOS, watchOS, tvOS, and visionOS. It w ...
4.x (2013). This use of the GCC frontend was considered mostly a temporary measure, but with the advent of
Clang Clang () is a compiler front end for the programming languages C, C++, Objective-C, Objective-C++, and the software frameworks OpenMP, OpenCL, RenderScript, CUDA, SYCL, and HIP. It acts as a drop-in replacement for the GNU Compiler ...
and advantages of LLVM and Clang's modern and modular codebase (as well as compilation speed), is mostly obsolete. LLVM currently supports compiling of Ada, C, C++, D,
Delphi Delphi (; ), in legend previously called Pytho (Πυθώ), was an ancient sacred precinct and the seat of Pythia, the major oracle who was consulted about important decisions throughout the ancient Classical antiquity, classical world. The A ...
, Fortran,
Haskell Haskell () is a general-purpose, statically typed, purely functional programming language with type inference and lazy evaluation. Designed for teaching, research, and industrial applications, Haskell pioneered several programming language ...
, Julia,
Objective-C Objective-C is a high-level general-purpose, object-oriented programming language that adds Smalltalk-style message passing (messaging) to the C programming language. Originally developed by Brad Cox and Tom Love in the early 1980s, it was ...
,
Rust Rust is an iron oxide, a usually reddish-brown oxide formed by the reaction of iron and oxygen in the catalytic presence of water or air moisture. Rust consists of hydrous iron(III) oxides (Fe2O3·nH2O) and iron(III) oxide-hydroxide (FeO(OH) ...
, and
Swift Swift or SWIFT most commonly refers to: * SWIFT, an international organization facilitating transactions between banks ** SWIFT code * Swift (programming language) * Swift (bird), a family of birds It may also refer to: Organizations * SWIF ...
using various frontends. Widespread interest in LLVM has led to several efforts to develop new frontends for many languages. The one that has received the most attention is Clang, a newer compiler supporting C, C++, and Objective-C. Primarily supported by Apple, Clang is aimed at replacing the C/Objective-C compiler in the GCC system with a system that is more easily integrated with
integrated development environment An integrated development environment (IDE) is a Application software, software application that provides comprehensive facilities for software development. An IDE normally consists of at least a source-code editor, build automation tools, an ...
s (IDEs) and has wider support for multithreading. Support for
OpenMP OpenMP is an application programming interface (API) that supports multi-platform shared-memory multiprocessing programming in C, C++, and Fortran, on many platforms, instruction-set architectures and operating systems, including Solaris, ...
directives has been included in
Clang Clang () is a compiler front end for the programming languages C, C++, Objective-C, Objective-C++, and the software frameworks OpenMP, OpenCL, RenderScript, CUDA, SYCL, and HIP. It acts as a drop-in replacement for the GNU Compiler ...
since release 3.8. The
Utrecht Utrecht ( ; ; ) is the List of cities in the Netherlands by province, fourth-largest city of the Netherlands, as well as the capital and the most populous city of the Provinces of the Netherlands, province of Utrecht (province), Utrecht. The ...
Haskell Haskell () is a general-purpose, statically typed, purely functional programming language with type inference and lazy evaluation. Designed for teaching, research, and industrial applications, Haskell pioneered several programming language ...
compiler can generate code for LLVM. While the generator was in early stages of development, in many cases it was more efficient than the C code generator. The Glasgow Haskell Compiler (GHC) backend uses LLVM and achieves a 30% speed-up of compiled code relative to native code compiling via GHC or C code generation followed by compiling, missing only one of the many optimizing techniques implemented by the GHC. Many other components are in various stages of development, including, but not limited to, the
Rust Rust is an iron oxide, a usually reddish-brown oxide formed by the reaction of iron and oxygen in the catalytic presence of water or air moisture. Rust consists of hydrous iron(III) oxides (Fe2O3·nH2O) and iron(III) oxide-hydroxide (FeO(OH) ...
compiler, a
Java bytecode Java bytecode is the instruction set of the Java virtual machine (JVM), the language to which Java and other JVM-compatible source code is compiled. Each instruction is represented by a single byte, hence the name bytecode, making it a compact ...
frontend, a
Common Intermediate Language Common Intermediate Language (CIL), formerly called Microsoft Intermediate Language (MSIL) or Intermediate Language (IL), is the intermediate language binary instruction set defined within the Common Language Infrastructure (CLI) specification. ...
(CIL) frontend, the MacRuby implementation of Ruby 1.9, various frontends for
Standard ML Standard ML (SML) is a General-purpose programming language, general-purpose, High-level programming language, high-level, Modular programming, modular, Functional programming, functional programming language with compile-time type checking and t ...
, and a new
graph coloring In graph theory, graph coloring is a methodic assignment of labels traditionally called "colors" to elements of a Graph (discrete mathematics), graph. The assignment is subject to certain constraints, such as that no two adjacent elements have th ...
register allocator.


Intermediate representation

The core of LLVM is the
intermediate representation An intermediate representation (IR) is the data structure or code used internally by a compiler or virtual machine to represent source code. An IR is designed to be conducive to further processing, such as optimization and translation. A "good" ...
(IR), a
low-level programming language A low-level programming language is a programming language that provides little or no Abstraction (computer science), abstraction from a computer's instruction set architecture, memory or underlying physical hardware; commands or functions in the ...
similar to assembly. IR is a strongly typed
reduced instruction set computer In electronics and computer science, a reduced instruction set computer (RISC) is a computer architecture designed to simplify the individual instructions given to the computer to accomplish tasks. Compared to the instructions given to a com ...
(RISC) instruction set which abstracts away most details of the target. For example, the calling convention is abstracted through call and ret instructions with explicit arguments. Also, instead of a fixed set of registers, IR uses an infinite set of temporaries of the form %0, %1, etc. LLVM supports three equivalent forms of IR: a human-readable assembly format, an in-memory format suitable for frontends, and a dense bitcode format for serializing. A simple
"Hello, world!" program A "Hello, World!" program is usually a simple computer program that emits (or displays) to the screen (often the Console application, console) a message similar to "Hello, World!". A small piece of code in most general-purpose programming languag ...
in the human-readable IR format: @.str = internal constant 4 x i8c"Hello, world\0A\00" declare i32 @printf(ptr, ...) define i32 @main(i32 %argc, ptr %argv) nounwind The many different conventions used and features provided by different targets mean that LLVM cannot truly produce a target-independent IR and retarget it without breaking some established rules. Examples of target dependence beyond what is explicitly mentioned in the documentation can be found in a 2011 proposal for "wordcode", a fully target-independent variant of LLVM IR intended for online distribution. A more practical example is PNaCl. The LLVM project also introduces another type of intermediate representation named MLIR which helps build reusable and extensible compiler infrastructure by employing a plugin architecture named Dialect. It enables the use of higher-level information on the program structure in the process of optimization including polyhedral compilation.


Backends

At version 16, LLVM supports many
instruction set In computer science, an instruction set architecture (ISA) is an abstract model that generally defines how software controls the CPU in a computer or a family of computers. A device or program that executes instructions described by that ISA, s ...
s, including
IA-32 IA-32 (short for "Intel Architecture, 32-bit", commonly called ''i386'') is the 32-bit version of the x86 instruction set architecture, designed by Intel and first implemented in the i386, 80386 microprocessor in 1985. IA-32 is the first incarn ...
,
x86-64 x86-64 (also known as x64, x86_64, AMD64, and Intel 64) is a 64-bit extension of the x86 instruction set architecture, instruction set. It was announced in 1999 and first available in the AMD Opteron family in 2003. It introduces two new ope ...
, ARM,
Qualcomm Hexagon Hexagon is the brand name for a family of digital signal processor (DSP) and later neural processing unit (NPU) products by Qualcomm. Hexagon is also known as QDSP6, standing for “sixth generation digital signal processor.” According to Qua ...
, LoongArch,
M68K The Motorola 68000 series (also known as 680x0, m68000, m68k, or 68k) is a family of 32-bit complex instruction set computer (CISC) microprocessors. During the 1980s and early 1990s, they were popular in personal computers and workstations and w ...
, MIPS,
NVIDIA Nvidia Corporation ( ) is an American multinational corporation and technology company headquartered in Santa Clara, California, and incorporated in Delaware. Founded in 1993 by Jensen Huang (president and CEO), Chris Malachowsky, and Curti ...
Parallel Thread Execution (PTX, also named ''NVPTX'' in LLVM documentation),
PowerPC PowerPC (with the backronym Performance Optimization With Enhanced RISC – Performance Computing, sometimes abbreviated as PPC) is a reduced instruction set computer (RISC) instruction set architecture (ISA) created by the 1991 Apple Inc., App ...
, AMD TeraScale, most recent
AMD Advanced Micro Devices, Inc. (AMD) is an American multinational corporation and technology company headquartered in Santa Clara, California and maintains significant operations in Austin, Texas. AMD is a hardware and fabless company that de ...
GPUs (also named ''AMDGPU'' in LLVM documentation), SPARC,
z/Architecture z/Architecture, initially and briefly called ESA Modal Extensions (ESAME), is IBM's 64-bit complex instruction set computer (CISC) instruction set architecture, implemented by its mainframe computers. IBM introduced its first z/Architecture ...
(also named ''SystemZ'' in LLVM documentation), and XCore. Some features are not available on some platforms. Most features are present for IA-32, x86-64, z/Architecture, ARM, and PowerPC.
RISC-V RISC-V (pronounced "risk-five") is an open standard instruction set architecture (ISA) based on established reduced instruction set computer (RISC) principles. The project commenced in 2010 at the University of California, Berkeley. It transfer ...
is supported as of version 7. In the past, LLVM also supported other backends, fully or partially, including C backend, Cell SPU, mblaze (MicroBlaze), AMD R600, DEC/Compaq
Alpha Alpha (uppercase , lowercase ) is the first letter of the Greek alphabet. In the system of Greek numerals, it has a value of one. Alpha is derived from the Phoenician letter ''aleph'' , whose name comes from the West Semitic word for ' ...
( Alpha AXP) and Nios2, but that hardware is mostly obsolete, and LLVM developers decided the support and maintenance costs were no longer justified. LLVM also supports
WebAssembly WebAssembly (Wasm) defines a portable binary-code format and a corresponding text format for executable programs as well as software interfaces for facilitating communication between such programs and their host environment. The main goal of ...
as a target, enabling compiled programs to execute in WebAssembly-enabled environments such as
Google Chrome Google Chrome is a web browser developed by Google. It was first released in 2008 for Microsoft Windows, built with free software components from Apple WebKit and Mozilla Firefox. Versions were later released for Linux, macOS, iOS, iPadOS, an ...
/
Chromium Chromium is a chemical element; it has Symbol (chemistry), symbol Cr and atomic number 24. It is the first element in Group 6 element, group 6. It is a steely-grey, Luster (mineralogy), lustrous, hard, and brittle transition metal. Chromium ...
,
Firefox Mozilla Firefox, or simply Firefox, is a free and open-source web browser developed by the Mozilla Foundation and its subsidiary, the Mozilla Corporation. It uses the Gecko rendering engine to display web pages, which implements curr ...
,
Microsoft Edge Microsoft Edge is a Proprietary Software, proprietary cross-platform software, cross-platform web browser created by Microsoft and based on the Chromium (web browser), Chromium open-source project, superseding Edge Legacy. In Windows 11, Edge ...
, Apple Safari or WAVM. LLVM-compliant WebAssembly compilers typically support mostly unmodified source code written in C, C++, D, Rust, Nim, Kotlin and several other languages. The LLVM machine code (MC) subproject is LLVM's framework for translating machine instructions between textual forms and machine code. Formerly, LLVM relied on the system assembler, or one provided by a toolchain, to translate assembly into machine code. LLVM MC's integrated assembler supports most LLVM targets, including IA-32, x86-64, ARM, and ARM64. For some targets, including the various MIPS instruction sets, integrated assembly support is usable but still in the beta stage.


Linker

The lld subproject is an attempt to develop a built-in, platform-independent
linker Linker or linkers may refer to: Computing * Linker (computing), a computer program that takes one or more object files generated by a compiler or generated by an assembler and links them with libraries, generating an executable program or shar ...
for LLVM. lld aims to remove dependence on a third-party linker. , lld supports
ELF An elf (: elves) is a type of humanoid supernatural being in Germanic peoples, Germanic folklore. Elves appear especially in Norse mythology, North Germanic mythology, being mentioned in the Icelandic ''Poetic Edda'' and the ''Prose Edda'' ...
, PE/COFF, Mach-O, and
WebAssembly WebAssembly (Wasm) defines a portable binary-code format and a corresponding text format for executable programs as well as software interfaces for facilitating communication between such programs and their host environment. The main goal of ...
in descending order of completeness. lld is faster than both flavors of GNU ld. Unlike the GNU linkers, lld has built-in support for link-time optimization (LTO). This allows for faster code generation as it bypasses the use of a linker plugin, but on the other hand prohibits interoperability with other flavors of LTO.


C++ Standard Library

The LLVM project includes an implementation of the
C++ Standard Library The C standard library, sometimes referred to as libc, is the standard library for the C programming language, as specified in the ISO C standard.ISO/ IEC (2018). '' ISO/IEC 9899:2018(E): Programming Languages - C §7'' Starting from the origina ...
named libc++, dual-licensed under the
MIT License The MIT License is a permissive software license originating at the Massachusetts Institute of Technology (MIT) in the late 1980s. As a permissive license, it puts very few restrictions on reuse and therefore has high license compatibility. Unl ...
and the UIUC license. Since v9.0.0, it was relicensed to the Apache License 2.0 with LLVM Exceptions.


Polly

This implements a suite of cache-locality optimizations as well as auto-parallelism and vectorization using a polyhedral model.


Debugger


C Standard Library

llvm-libc is an incomplete, upcoming, ABI independent
C standard library The C standard library, sometimes referred to as libc, is the standard library for the C (programming language), C programming language, as specified in the ISO C standard.International Organization for Standardization, ISO/International Electrote ...
designed by and for the LLVM project.


Derivatives

Due to its permissive license, many vendors release their own tuned forks of LLVM. This is officially recognized by LLVM's documentation, which suggests against using version numbers in feature checks for this reason. Some of the vendors include: * AMD's AMD Optimizing C/C++ Compiler is based on LLVM, Clang, and Flang. * Apple maintains an open-source fork for
Xcode Xcode is a suite of developer tools for building apps on Apple devices. It includes an integrated development environment (IDE) of the same name for macOS, used to develop software for macOS, iOS, iPadOS, watchOS, tvOS, and visionOS. It w ...
. * Arm provides a number of LLVM based toolchains, including Arm Compiler for Embedded targeting bare-metal development and Arm Compiler for Linux targeting the High Performance Computing market * Flang, Fortran project in development *
IBM International Business Machines Corporation (using the trademark IBM), nicknamed Big Blue, is an American Multinational corporation, multinational technology company headquartered in Armonk, New York, and present in over 175 countries. It is ...
is adopting LLVM in its C/C++ and Fortran compilers. *
Intel Intel Corporation is an American multinational corporation and technology company headquartered in Santa Clara, California, and Delaware General Corporation Law, incorporated in Delaware. Intel designs, manufactures, and sells computer compo ...
has adopted LLVM for their next generation
Intel C++ Compiler Intel oneAPI DPC++/C++ Compiler and Intel C++ Compiler Classic (deprecated icc and icl is in Intel OneAPI HPC toolkit) are Intel’s C, C++, SYCL, and Data Parallel C++ (DPC++) compilers for Intel processor-based systems, available for Wind ...
. * The
Los Alamos National Laboratory Los Alamos National Laboratory (often shortened as Los Alamos and LANL) is one of the sixteen research and development Laboratory, laboratories of the United States Department of Energy National Laboratories, United States Department of Energy ...
has a parallel-computing fork of LLVM 8 named "Kitsune". *
Nvidia Nvidia Corporation ( ) is an American multinational corporation and technology company headquartered in Santa Clara, California, and incorporated in Delaware. Founded in 1993 by Jensen Huang (president and CEO), Chris Malachowsky, and Curti ...
uses LLVM in the implementation of its NVVM
CUDA In computing, CUDA (Compute Unified Device Architecture) is a proprietary parallel computing platform and application programming interface (API) that allows software to use certain types of graphics processing units (GPUs) for accelerated gene ...
Compiler. The NVVM compiler is distinct from the "NVPTX" backend mentioned in the Backends section, although both generate PTX code for Nvidia GPUs. * Since 2013, Sony has been using LLVM's primary front-end Clang compiler in the
software development kit A software development kit (SDK) is a collection of software development tools in one installable package. They facilitate the creation of applications by having a compiler, debugger and sometimes a software framework. They are normally specific t ...
(SDK) of its
PlayStation 4 The PlayStation 4 (PS4) is a home video game console developed by Sony Interactive Entertainment. Announced as the successor to the PlayStation 3 in February 2013, it was launched on November 15, 2013, in North America, November 29, 2013, in ...
console.


See also

*
Common Intermediate Language Common Intermediate Language (CIL), formerly called Microsoft Intermediate Language (MSIL) or Intermediate Language (IL), is the intermediate language binary instruction set defined within the Common Language Infrastructure (CLI) specification. ...
*
HHVM HipHop Virtual Machine (HHVM) is an Open-source software, open-source virtual machine based on Just-in-time compilation, just-in-time (JIT) compilation that serves as an execution engine for the Hack (programming language), Hack programming lang ...
* C-- * Amsterdam Compiler Kit (ACK) * Optimizing compiler * LLDB (debugger) * GNU lightning *
GNU Compiler Collection The GNU Compiler Collection (GCC) is a collection of compilers from the GNU Project that support various programming languages, Computer architecture, hardware architectures, and operating systems. The Free Software Foundation (FSF) distributes ...
(GCC) * Pure *
OpenCL OpenCL (Open Computing Language) is a software framework, framework for writing programs that execute across heterogeneous computing, heterogeneous platforms consisting of central processing units (CPUs), graphics processing units (GPUs), di ...
* ROCm *
Emscripten Emscripten is an LLVM/Clang-based compiler that compiles C and C++ source code to WebAssembly, primarily for execution in web browsers. Emscripten allows applications and libraries written in C or C++ to be compiled ahead of time and run effi ...
* TenDRA Distribution Format * Architecture Neutral Distribution Format (ANDF) *
Comparison of application virtualization software Application virtualization software refers to both application virtual machines and software responsible for implementing them. Application virtual machines are typically used to allow application bytecode to run portably on many different comput ...
* SPIR-V * University of Illinois at Urbana Champaign discoveries & innovations


Literature

*
Chris Lattner Christopher Arthur Lattner (born 1978) is an American software engineer and creator of LLVM, the Clang compiler, the Swift (programming language), Swift programming language and the MLIR (software), MLIR compiler infrastructure. After his PhD ...
-
The Architecture of Open Source Applications - Chapter 11 LLVM
', , released 2012 under
CC BY A Creative Commons (CC) license is one of several public copyright licenses that enable the free distribution of an otherwise copyrighted "work". A CC license is used when an author wants to give other people the right to share, use, and bui ...
3.0 (
Open Access Open access (OA) is a set of principles and a range of practices through which nominally copyrightable publications are delivered to readers free of access charges or other barriers. With open access strictly defined (according to the 2001 de ...
).
LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation
a published paper by Chris Lattner, Vikram Adve


References


External links

* {{Use mdy dates, date=October 2018 Free and open source compilers Register-based virtual machines Software using the University of Illinois/NCSA Open Source License Software using the Apache license