A multi-pass compiler is a type of
compiler
In computing, a compiler is a computer program that Translator (computing), translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primaril ...
that processes the
source code
In computing, source code, or simply code or source, is a plain text computer program written in a programming language. A programmer writes the human readable source code to control the behavior of a computer.
Since a computer, at base, only ...
or
abstract syntax tree
An abstract syntax tree (AST) is a data structure used in computer science to represent the structure of a program or code snippet. It is a tree representation of the abstract syntactic structure of text (often source code) written in a formal ...
of a program several times. This is in contrast to a
one-pass compiler
In computer programming, a one-pass compiler is a compiler that processes each compilation unit only once, sequentially translating each source statement or declaration into something close to its final machine code. This is in contrast to a mul ...
, which traverses the program only once. Each pass takes the result of the previous pass as the input, and creates an intermediate output. In this way, the (intermediate) code is improved pass by pass, until the final pass produces the final code.
Multi-pass compilers are sometimes called wide compilers,
referring to the greater scope of the passes: they can "see" the entire program being compiled, instead of just a small portion of it. The wider scope thus available to these compilers allows better
code generation (e.g. smaller code size, faster code) compared to the output of one-pass compilers, at the cost of higher compiler time and memory consumption. In addition, some
languages
Language is a structured system of communication that consists of grammar and vocabulary. It is the primary means by which humans convey meaning, both in spoken and signed forms, and may also be conveyed through writing. Human language is ch ...
cannot be compiled in a single pass, as a result of their design.
Typical multi-pass compiler
Lexical analysis
This stage of a multi-pass compiler is to remove irrelevant information from the source program that syntax analysis will not be able to use or interpret. Irrelevant information could include things like comments and white space. In addition to removing the irrelevant information, the lexical analysis determines the lexical tokens of the language. This step means that
forward declaration
In computer programming, a forward declaration is a declaration of an identifier (denoting an entity such as a type, a variable, a constant, or a function) for which the programmer has not yet given a complete definition.
It is required for a com ...
is generally not necessary if a multi-pass compiler is used.
This phase is focused on breaking a sequence of characters into tokens with attributes such as kind, type, value, and potentially others, as well.
Syntax analysis
Syntax analysis is responsible for looking at syntax rules of the language (often as a
context-free grammar
In formal language theory, a context-free grammar (CFG) is a formal grammar whose production rules
can be applied to a nonterminal symbol regardless of its context.
In particular, in a context-free grammar, each production rule is of the fo ...
), and building some intermediate representation of the language. An example of this intermediate representation could be something like an
abstract syntax tree
An abstract syntax tree (AST) is a data structure used in computer science to represent the structure of a program or code snippet. It is a tree representation of the abstract syntactic structure of text (often source code) written in a formal ...
or a
directed acyclic graph
In mathematics, particularly graph theory, and computer science, a directed acyclic graph (DAG) is a directed graph with no directed cycles. That is, it consists of vertices and edges (also called ''arcs''), with each edge directed from one ...
.
Semantic analysis
Semantic analysis takes the representation made from syntax analysis and applies semantic rules to the representation to make sure that the program meets the semantic rules requirements of the language. For example, in the example below at the stage of semantic analysis if the language required that conditions on ''if'' statements were boolean expressions the ''cond'' would be type-checked to make sure it would be a valid boolean expression.
if (cond) else
In addition to performing semantic analysis at this stage of compilation, often
symbol tables are created in order to assist in code generation.
Code generation
This final stage of a typical compiler converts the intermediate representation of program into an executable set of instructions (often
assembly). This last stage is the only stage in compilation that is machine dependent. There can also be
optimization
Mathematical optimization (alternatively spelled ''optimisation'') or mathematical programming is the selection of a best element, with regard to some criteria, from some set of available alternatives. It is generally divided into two subfiel ...
done at this stage of compilation that make the program more efficient.
Other passes of compiler include intermediate code generation phase which takes place before code generation and code optimization phase which can take place when the source program is written, or after intermediate code generation phase, or after code generation phase.
Advantages of multi-pass compilers
Machine Independent: Since the multiple passes include a modular structure, and the code generation decoupled from the other steps of the compiler, the passes can be reused for different hardware/machines.
More Expressive Languages: Multiple passes obviate the need for forward declarations, allowing mutual recursion to be implemented elegantly. The prime examples of languages requiring forward declarations due to the requirement of being compilable in a single pass include
C and
Pascal, whereas
Java
Java is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea (a part of Pacific Ocean) to the north. With a population of 156.9 million people (including Madura) in mid 2024, proje ...
does not have forward declarations.
References
*
Bornat, Richard''Understanding and Writing Compilers: A Do It Yourself Guide'' Macmillan Publishing, 1979.
* Bent Thomsen, ''Languages and Compilers SProg og Overseattere'', Department of Computer Science, Aalborg University
{{refend
Compilers