MLIR (software)

MLIR (Multi-Level Intermediate Representation) is an open-source compiler infrastructure project developed as a sub-project of the LLVM project. It provides a modular and extensible intermediate representation (IR) framework intended to facilitate the construction of domain-specific compilers and improve compilation for heterogeneous computing platforms. MLIR supports multiple abstraction levels in a single IR and introduces dialects, a mechanism for defining custom operations, types, and attributes tailored to specific domains. The name "Multi-Level Intermediate Representation" reflects the system’s ability to model computations at various abstraction levels and progressively lower them toward machine code.

MLIR was originally developed in 2018 by Chris Lattner at Google, and publicly released as part of LLVM in 2019. It was designed to address challenges in building compilers for modern workloads such as machine learning, hardware acceleration, and high-level synthesis by providing reusable components and standardizing the representation of intermediate computations across different programming languages and hardware targets. MLIR is used in a range of systems including TensorFlow, Mojo, TPU-MLIR, and others. It is released under the Apache License 2.0 with LLVM exceptions and is maintained as part of the LLVM project.


History

Work on MLIR began in 2018, led by Chris Lattner at Google in collaboration with Mehdi Amini, River Riddle, and others, as a response to the growing complexity of modern compiler toolchains. The project aimed to improve the modularity, composability, and maintainability of compiler infrastructures, particularly in domains such as machine learning, high-level synthesis, and hardware acceleration. It was formally introduced at the 2019 LLVM Developer Meeting and was open-sourced later that year as part of the LLVM monorepository.

MLIR’s architecture was shaped by prior experience building compilers such as XLA and LLVM, where limitations in existing intermediate representations hindered optimization and reuse across abstraction levels. To address this, MLIR introduced the concept of multiple IR abstraction levels that coexist in the same system and are gradually lowered through well-defined transformations. A foundational design feature was the use of dialects, allowing different domains and hardware targets to define custom operations and type systems while maintaining interoperability.

Since its release, MLIR has been adopted by multiple compiler ecosystems and research efforts. In TensorFlow, MLIR serves as the foundation for rewriting and lowering transformations in components such as XLA and the TensorFlow Runtime. The Mojo language, developed by Modular Inc., relies on MLIR to achieve ahead-of-time compilation for artificial intelligence workloads. Additional projects that have built on MLIR include TPU-MLIR for compiling models to Tensor Processing Unit hardware, ONNX-MLIR for interoperable machine learning models, MLIR-AIE for targeting Xilinx AI Engines, IREE for compiling and executing machine learning models across CPUs, GPUs, and accelerators, DSP-MLIR, a compiler infrastructure tailored to digital signal processing (DSP) applications, and torch-mlir, which brings MLIR-based compilation capabilities to the PyTorch ecosystem.

MLIR continues to evolve as part of the LLVM project and follows the project's release schedule and development policies. It is developed collaboratively by contributors from industry, academia, and the broader open-source community.


Dialects

In MLIR, a dialect defines a self-contained namespace of operations, types, attributes, and other constructs. Dialects are the primary mechanism for extensibility, allowing developers to introduce domain-specific abstractions while maintaining compatibility within the broader MLIR framework.

Each operation within a dialect is identified by a unique name and may include optional operands, results, attributes, and regions. Operands and results follow static single-assignment (SSA) form, and each result is associated with a type. Attributes represent compile-time metadata, such as constant values. Regions consist of ordered blocks, each of which may take input arguments and contain a sequence of nested operations. While MLIR is designed around SSA, it avoids traditional PHI nodes by using block arguments in conjunction with the operands of control-flow operations to model value merging.

The general syntax for an operation is the following:

  %res:2 = "mydialect.morph"(%input#3) ({
    ^bb0(%arg0: !mydialect<"custom_type">):
      // ... nested operations ...
  }) {some.attribute = true, other_attribute = 42 : i64}
      : (!mydialect<"custom_type">) -> (!mydialect<"other_type">, !mydialect<"other_type">)
      loc(callsite("foo" at "mysource.cc":10:8))

This operation, named morph, belongs to the mydialect dialect. It takes one input operand (%input#3, that is, the fourth result of the operation defining %input) of type custom_type and produces two output values of type other_type, which can be referred to as %res#0 and %res#1. The operation includes two attributes, some.attribute and other_attribute, and contains a region with a single block (^bb0) that accepts one argument. The loc keyword specifies source-level location information, which can be used for debugging or diagnostic reporting.

The syntax of operations, types, and attributes can also be customized according to user preferences by implementing appropriate parsing and printing functions within the operation definition.


Core dialects

The MLIR dialect ecosystem is open and extensible, allowing end users to define new dialects that capture the semantics of specific computational domains. At the same time, the MLIR codebase provides a variety of built-in dialects that address common patterns found in intermediate representations. These core dialects are designed to be self-contained and interoperable, making them suitable for reuse across different compiler stacks.

For example, the arith dialect includes basic mathematical operations over integer and floating-point types, while the memref dialect provides operations for memory allocation and access. Control-flow abstractions are handled by dialects such as affine, which supports affine loop nests suitable for polyhedral optimization, and scf, which provides structured control flow using constructs like for, if, and while. The func dialect supports function definitions and calls, while the gpu dialect introduces primitives for GPU programming models. Additionally, the tosa dialect defines a portable and quantization-friendly operator set for machine learning inference. Finally, the llvm dialect provides a one-to-one mapping to LLVM IR, enabling seamless lowering to LLVM’s backend and reuse of its optimization and code generation infrastructure.

The following signature declares a function that takes two floating-point matrices and returns their element-wise sum:

  func.func @matrix_add(%arg0: memref<10x20xf32>, %arg1: memref<10x20xf32>) -> memref<10x20xf32>

Although different dialects may be used to express similar computations, the level of abstraction and the intended compilation flow may vary. In this example, the affine dialect enables polyhedral analysis and optimizations, while the memref and arith dialects express memory and arithmetic operations, respectively, as sketched below.
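A minimal sketch of a possible body for matrix_add, assuming an affine loop nest with memref accesses and arith arithmetic (illustrative; the exact listing may differ):

  func.func @matrix_add(%arg0: memref<10x20xf32>, %arg1: memref<10x20xf32>) -> memref<10x20xf32> {
    // Allocate the buffer holding the result.
    %result = memref.alloc() : memref<10x20xf32>
    // Affine loop nest over the two matrix dimensions.
    affine.for %i = 0 to 10 {
      affine.for %j = 0 to 20 {
        %lhs = affine.load %arg0[%i, %j] : memref<10x20xf32>
        %rhs = affine.load %arg1[%i, %j] : memref<10x20xf32>
        %sum = arith.addf %lhs, %rhs : f32
        affine.store %sum, %result[%i, %j] : memref<10x20xf32>
      }
    }
    func.return %result : memref<10x20xf32>
  }

Because the loop bounds and subscripts are affine expressions, passes built around the affine dialect can apply polyhedral analyses and transformations to this nest.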


Operation definition specification

The operations of a dialect can be defined using the C++ language, but also in a more convenient and robust way by using the Operation Definition Specification (ODS). Using TableGen, the C++ code for declarations and definitions is then generated automatically. The generated code can include parsing and printing methods, based on a simple string that maps the structure of the desired textual representation, together with all the boilerplate code for accessing fields and performing common actions such as verification of the semantics of each operation, canonicalization, and folding. The same declaration mechanism can also be used for types and attributes, which are the other two categories of elements constituting a dialect.

The following example illustrates how to specify the ''assembly format'' of an operation expecting a variadic number of operands and producing zero results. The textual representation consists of the attribute dictionary, optionally followed by the list of operands, a colon, and the types of the operands:

  let assemblyFormat = "attr-dict ($operands^ `:` type($operands))?";


Transformations

Transformations can always be performed directly on the IR, without relying on built-in coordination mechanisms. However, to ease both implementation and maintenance, MLIR provides an infrastructure for IR rewriting composed of different rewrite drivers. Each driver receives a set of objects named ''patterns'', each of which has its own internal logic to match operations with certain properties. When an operation is matched, the rewrite process is performed and the IR is modified according to the logic within the pattern.
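As an illustration of what a single pattern might do, consider a hypothetical simplification pattern (an assumed example, not a reference to a specific upstream pattern) that matches an arith.addi whose second operand is the constant zero and replaces its result with the other operand:

  func.func @fold_add_zero(%x: i32) -> i32 {
    %zero = arith.constant 0 : i32
    // A pattern can match this addition, replace all uses of %sum with %x,
    // and erase the operation; the now-unused constant can then be removed too.
    %sum = arith.addi %x, %zero : i32
    func.return %sum : i32
  }

After the rewrite, the function simply returns %x.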


Dialect conversion driver

This driver operates according to the ''legality'' of existing operations, meaning that the driver receives a set of rules determining which operations have to be considered ''illegal'' and expects the patterns to match and convert them into ''legal'' ones. The logic behind those rules can be arbitrarily complex: it may be based only on the dialect to which an operation belongs, but it can also inspect more specific properties such as attributes or nested operations. As the name suggests, this driver is typically used for converting the operations of one dialect into operations belonging to a different one. In this scenario, the whole source dialect would be marked as illegal, the destination one as legal, and patterns for the source dialect operations would be provided. The dialect conversion framework also provides support for type conversion, which has to be performed on operands and results to convert them to the type system of the destination dialect.

MLIR allows multiple conversion paths to be taken. Considering the example about the sum of matrices, one possible lowering strategy is to generate for-loops belonging to the ''scf'' dialect, obtaining code to be executed on CPUs; another possible strategy is to use the ''gpu'' dialect to generate code for GPUs.
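A minimal sketch of the CPU-oriented path, assuming the matrix_add example above and using only the scf, memref, arith, and func dialects (a sketch under these assumptions, not a canonical lowering):

  func.func @matrix_add(%arg0: memref<10x20xf32>, %arg1: memref<10x20xf32>) -> memref<10x20xf32> {
    %c0 = arith.constant 0 : index
    %c1 = arith.constant 1 : index
    %c10 = arith.constant 10 : index
    %c20 = arith.constant 20 : index
    // Allocate the output buffer.
    %result = memref.alloc() : memref<10x20xf32>
    // Explicit structured loops over the two dimensions.
    scf.for %i = %c0 to %c10 step %c1 {
      scf.for %j = %c0 to %c20 step %c1 {
        %lhs = memref.load %arg0[%i, %j] : memref<10x20xf32>
        %rhs = memref.load %arg1[%i, %j] : memref<10x20xf32>
        %sum = arith.addf %lhs, %rhs : f32
        memref.store %sum, %result[%i, %j] : memref<10x20xf32>
      }
    }
    func.return %result : memref<10x20xf32>
  }

A GPU-oriented lowering would instead express the same computation with the ''gpu'' dialect, for example by mapping the two loops onto the blocks and threads of a gpu.launch region so that each thread handles one element.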


Greedy pattern rewrite driver

The driver greedily applies the provided patterns according to their benefit, until a fixed point is reached or the maximum number of iterations is exceeded. Each pattern declares its own benefit; in case of ties, the relative order within the list of patterns is used.


Traits and interfaces

MLIR allows existing optimizations (e.g., common subexpression elimination, loop-invariant code motion) to be applied to custom dialects by means of traits and interfaces. These two mechanisms enable transformation passes to operate on operations without knowing their actual implementation, relying only on the properties that traits or interfaces provide. Traits are meant to be attached to operations without requiring any additional implementation; their purpose is to indicate that the operation satisfies certain properties (e.g., having exactly two operands). Interfaces, instead, represent a more powerful tool through which an operation can be queried about some specific aspect, whose value may change between instances of the same kind of operation. An example of an interface is the representation of memory effects: each operation that operates on memory may have such an interface attached, but the actual effects may depend on the actual operands (e.g., a function call whose arguments may be constants or references to memory).


Applications

The freedom in modeling intermediate representations enables MLIR to be used in a wide range of scenarios. This includes traditional programming languages, but also high-level synthesis, quantum computing, and homomorphic encryption. Machine learning applications also take advantage of built-in polyhedral compilation techniques, together with dialects targeting accelerators and other heterogeneous systems.


See also

* TensorFlow
* Tensor Processing Unit


External links

* {{GitHub|llvm/mlir-www|MLIR WWW source code}}