Although C++ source files are often named with a .cpp extension, that is an abbreviation for "C plus plus", not "C preprocessor".
Preprocessor directives
The following languages have the following accepted directives.
C/C++
The following tokens are recognised by the preprocessor in the context of preprocessor directives.
* #if
* #elif
* #else
* #endif
* #ifdef
* #ifndef
* #elifdef
* #elifndef
* #define
* #undef
* #include
* #embed
* #line
* #error
* #warning
* #pragma
* defined (follows a conditional directive; not actually a directive, but rather an operator)
* __has_include
* __has_cpp_attribute (C++ only)
* __has_embed
Until C++26, the C++ keywords import, export, and module were partially handled by the preprocessor as well.
Haskell also accepts C preprocessor directives.
C#
Although C# does not have a separate preprocessor, these directives are processed as if there were one.
* #nullable
* #if
* #elif
* #else
* #endif
* #define
* #undef
* #region
* #endregion
* #error
* #warning
* #line
* #pragma
C# does not use a preprocessor to handle these directives, and thus they are not handled or removed by a preprocessor, but rather directly read by the C# compiler as a feature of the language.
Objective-C
The following tokens are recognised by the preprocessor in the context of preprocessor directives.
* #if
* #elif
* #else
* #endif
* #ifdef
* #ifndef
* #define
* #undef
* #include
* #import
* #error
* #pragma
* defined
History
The preprocessor was introduced to C around 1973 at the urging of Alan Snyder and also in recognition of the usefulness of the file inclusion mechanisms available in BCPL and PL/I. The first version offered file inclusion via #include and parameterless string replacement macros via #define. It was extended shortly after, firstly by Mike Lesk and then by John Reiser, to add arguments to macros and to support conditional compilation.
The C preprocessor was part of a long macro-language tradition at Bell Labs, which was started by Douglas Eastwood and Douglas McIlroy in 1959.
Phases
Preprocessing is defined by the first four (of eight) ''phases of translation'' specified in the C Standard.
# Trigraph replacement: The preprocessor replaces trigraph sequences with the characters they represent. This phase was removed in C23, following the lead of C++17.
# Line splicing: Physical source lines that are continued with escaped newline sequences are ''spliced'' to form logical lines.
# Tokenization: The preprocessor breaks the result into ''preprocessing tokens'' and whitespace. It replaces comments with whitespace.
# Macro expansion and directive handling: Preprocessing directive lines, including file inclusion and conditional compilation, are executed. The preprocessor simultaneously expands macros and, since the 1999 version of the C standard, handles _Pragma operators.
Features
File inclusion
There are two directives in the C preprocessor for including contents of files:
* #include, used for directly including the contents of a file in-place (typically containing code of some kind)
* #embed, used for directly including or embedding the contents of a binary resource in-place
Code inclusion
To include the content of one file into another, the preprocessor replaces a line that starts with #include with the content of the file specified after the directive. The inclusion may be logical in the sense that the resulting content may not be stored on disk and certainly is not overwritten to the source file. The file being included need not contain any sort of code, as this directive will copy the contents of whatever file is included in-place, but the most typical use of #include is to include a header file (or in some rarer cases, a source file).
For example, the preprocessor replaces the line #include <stdio.h> with the content of the standard library header file stdio.h, in which the function printf() and other symbols are declared.
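The inclusion described above can be sketched as follows. The helper name greet is hypothetical and only serves the illustration; the point is that printf() is usable because its declaration arrives with the included header.

```c
#include <stdio.h>  /* replaced by the content of the stdio.h header */

/* Hypothetical helper: prints a greeting and returns the number of
   characters written, as reported by printf(). */
int greet(void) {
    return printf("Hello, World!\n");
}
```

Without the #include line, a modern compiler would typically reject the call to printf() as a use of an undeclared function.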
In C, a header file conventionally has a .h extension. In C++, the convention for file extension varies, with common extensions .h and .hpp. But the preprocessor includes a file regardless of the extension. In fact, sometimes code includes .c or .cpp files.
To prevent including the same file multiple times, which often leads to a compiler error, a header file typically contains an #include guard or, if supported by the preprocessor, #pragma once.
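A minimal sketch of an #include guard, assuming a hypothetical header grid.h that defines struct grid. To keep the example in one file, the guarded header body is written out twice, standing in for two inclusions of the same header; the names GRID_H, struct grid, and grid_area are illustrative only.

```c
/* First inclusion of the hypothetical header grid.h */
#ifndef GRID_H
#define GRID_H
struct grid { int width; int height; };
#endif /* GRID_H */

/* Second inclusion: GRID_H is already defined, so the body is skipped
   and struct grid is not defined a second time. */
#ifndef GRID_H
#define GRID_H
struct grid { int width; int height; };
#endif /* GRID_H */

/* Uses the single surviving definition of struct grid. */
int grid_area(int width, int height) {
    struct grid g = { width, height };
    return g.width * g.height;
}
```

Without the guard, the second copy would redefine struct grid and the translation unit would fail to compile.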
Binary resource inclusion
C23 and C++26 introduce the #embed directive for binary resource inclusion, which allows including the content of a binary file into a source even though it's not valid C code.
This allows binary resources (like images) to be included into a program without requiring processing by external tools like xxd -i and without the use of string literals, which have a length limit on MSVC. Similarly to xxd -i, the directive is replaced by a comma-separated list of integers corresponding to the data of the specified resource. More precisely, if an array of type unsigned char is initialized using an #embed directive, the result is the same as if the resource was written to the array using fread (unless a parameter changes the embed element width to something other than CHAR_BIT). Apart from the convenience, #embed is also easier for compilers to handle, since they are allowed to skip expanding the directive to its full form due to the as-if rule.
The file to embed is specified the same as for #include, either with angle brackets or with double quotes. The directive also accepts parameters that customize its behavior. The limit parameter is used to limit the width of the included data. It is mostly intended to be used with "infinite" files like urandom. The prefix and suffix parameters allow for specifying a prefix and suffix to the embedded data. Finally, the if_empty parameter replaces the entire directive if the resource is empty. All standard parameters can be surrounded by double underscores, just like standard attributes in C23; for example, __prefix__ is interchangeable with prefix. Implementation-defined parameters use a form similar to attribute syntax (e.g., vendor::attr) but without the square brackets. While all standard parameters require an argument to be passed to them (e.g., limit requires a width), this is generally optional and even the set of parentheses can be omitted if an argument is not required, which might be the case for some implementation-defined parameters.
Conditional compilation
Conditional compilation is supported via the if-else core directives #if, #else, #elif, and #endif and with contraction directives #ifdef and #ifndef, which stand for #if defined(...) and #if !defined(...), respectively. For example, a printf() call can be wrapped in #ifdef VERBOSE and #endif so that it is only included for compilation if VERBOSE is defined.
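A short sketch of the VERBOSE case described above. The function name report and the message text are assumptions made for the illustration; VERBOSE is defined in the file here, but it could equally be supplied on the compiler command line.

```c
#include <stdio.h>

#define VERBOSE  /* remove this line to compile the printf() call out */

/* Returns the number of characters printed, or 0 when the diagnostic
   has been excluded from compilation. */
int report(int item) {
    int printed = 0;
#ifdef VERBOSE
    printed = printf("processing item %d\n", item);
#endif
    return printed;
}
```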
Macro string replacement
;Object-like
A macro specifies how to replace text in the source code with other text. An ''object-like'' macro defines a token that the preprocessor replaces with other text. It does not include parameter syntax and therefore cannot support parameterization. For example, a macro definition can associate the text "1 / 12" with the token "VALUE".
;Function-like
A ''function-like'' macro supports parameters. For example, with a macro ADD(x, y) whose replacement text is x + y, ADD(VALUE, 2) expands to 1 / 12 + 2.
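The expansion described above can be reproduced with the following sketch. The definition of ADD is an assumption reconstructed from the stated expansion, and add_value is a hypothetical wrapper used only to evaluate the result.

```c
#define VALUE 1 / 12      /* object-like macro */
#define ADD(x, y) x + y   /* function-like macro (assumed definition) */

/* ADD(VALUE, 2) expands to 1 / 12 + 2. With integer arithmetic,
   1 / 12 is 0, so the expression evaluates to 2. */
int add_value(void) {
    return ADD(VALUE, 2);
}
```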
;Variadic
A variadic macro (introduced with C99) accepts a varying number of arguments, which is particularly useful when wrapping functions that accept a variable number of parameters, such as printf.
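As an illustration, the following sketch wraps fprintf() in a variadic macro. LOG_ERROR and report_failure are hypothetical names; the macro assumes that its first argument is a string literal so that it can be joined to the "error: " prefix.

```c
#include <stdio.h>

/* Hypothetical logging wrapper: __VA_ARGS__ stands for the format
   string and all remaining arguments. */
#define LOG_ERROR(...) fprintf(stderr, "error: " __VA_ARGS__)

/* Returns the number of characters written to stderr. */
int report_failure(const char *path, int code) {
    return LOG_ERROR("cannot open %s (code %d)\n", path, code);
}
```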
;Order of expansion
''Function-like'' macro expansion occurs in the following stages:
# Stringification operations are replaced with the textual representation of their argument's replacement list (without performing expansion).
# Parameters are replaced with their replacement list (without performing expansion).
# Concatenation operations are replaced with the concatenated result of the two operands (without expanding the resulting token).
# Tokens originating from parameters are expanded.
# The resulting tokens are expanded as normal.
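This may produce surprising results: for example, a macro name passed as an operand of a concatenation operation is concatenated before it can be expanded.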
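The stages listed above can be observed with a small sketch. All macro and function names here are chosen for the illustration; STR and XSTR are helper macros that turn a result into a string so that it can be inspected.

```c
#define HE HI
#define LLO _THERE
#define HELLO "HI THERE"
#define CAT(a, b) a##b
#define XCAT(a, b) CAT(a, b)
#define CALL(fn) fn(HE, LLO)
#define STR(x) #x
#define XSTR(x) STR(x)

/* HE and LLO are pasted before they can be expanded, giving HELLO,
   which then expands to "HI THERE". */
const char *cat_result(void) { return CAT(HE, LLO); }

/* XCAT expands its arguments first (HE -> HI, LLO -> _THERE), so CAT
   pastes HI and _THERE into the single token HI_THERE. */
const char *xcat_result(void) { return XSTR(XCAT(HE, LLO)); }

/* CALL(CAT) becomes CAT(HE, LLO), which behaves like the first case. */
const char *call_result(void) { return CALL(CAT); }
```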
Undefine macro
A macro definition can be removed from the preprocessor context via #undef such that subsequent references to the macro token will not expand.
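A minimal sketch of the effect of #undef, using hypothetical names (LIMIT, LIMIT_STILL_DEFINED, and the two functions):

```c
#define LIMIT 10

/* LIMIT is still defined here and expands to 10. */
int limit_before(void) { return LIMIT; }

#undef LIMIT

/* After #undef, the preprocessor no longer knows LIMIT. */
#ifdef LIMIT
#define LIMIT_STILL_DEFINED 1
#else
#define LIMIT_STILL_DEFINED 0
#endif

int limit_defined_after_undef(void) { return LIMIT_STILL_DEFINED; }
```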
Predefined macros
The preprocessor provides some macro definitions automatically. The C standard specifies that __FILE__ expands to the name of the file being processed and __LINE__ expands to the number of the line that contains the directive. A macro such as DEBUGPRINT can use them to format and print a message with the file name and line number.
For an invocation on line 30 of file util.c with count equal to 123, the output is: util.c:30 count=123.
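One possible definition of such a DEBUGPRINT macro is sketched below. The exact format string is an assumption chosen to match the output shown above, and report_count is a hypothetical caller.

```c
#include <stdio.h>

/* Assumed definition: prefixes the message with "file:line ".
   At least one argument must follow the format string. */
#define DEBUGPRINT(fmt, ...) \
    printf("%s:%d " fmt "\n", __FILE__, __LINE__, __VA_ARGS__)

/* Prints, for example, "util.c:30 count=123" when compiled as util.c
   with the macro invoked on line 30. */
int report_count(int count) {
    return DEBUGPRINT("count=%d", count);
}
```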
The C standard also specifies that __STDC__ expands to "1" if the implementation conforms to the ISO standard and "0" otherwise, and that __STDC_VERSION__ expands to a numeric literal specifying the version of the standard supported by the implementation. Standard C++ compilers support the __cplusplus macro. Compilers running in non-standard mode must not set these macros or must define others to signal the differences.
Other standard macros include __DATE__
, the current date, and __TIME__
, the current time.
The second edition of the C Standard, C99, added support for __func__
, which contains the name of the function definition within which it is contained, but because the preprocessor is agnostic to the grammar of C, this must be done in the compiler itself using a variable local to the function.
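The following sketch shows __DATE__, __TIME__, and __func__ in use. The function names are hypothetical; the build stamp relies on the compiler joining adjacent string literals.

```c
/* __DATE__ and __TIME__ are string literals, so they can be joined
   with other literals at compile time. */
const char *build_stamp(void) {
    return __DATE__ " " __TIME__;
}

/* __func__ behaves like a static array holding the name of the
   enclosing function. */
const char *current_function(void) {
    return __func__;
}
```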
One little-known usage pattern of the C preprocessor is known as X-Macros (Wirzenius, Lars. "C Preprocessor Trick For Implementing Similar Data Types". Retrieved January 9, 2011). An X-Macro is an include file, commonly named with the extension .def instead of the traditional .h. This file contains a list of similar macro calls, which can be referred to as "component macros." The include file is then referenced repeatedly.
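A minimal sketch of the X-Macro pattern. To stay within one file, the list that would normally live in a separate include file (a hypothetical colors.def) is written as the macro COLOR_LIST; all names are illustrative. The list is referenced twice, each time with a different definition of the component macro X.

```c
/* In-file stand-in for a hypothetical colors.def: one X(...) per entry. */
#define COLOR_LIST \
    X(RED)         \
    X(GREEN)       \
    X(BLUE)

/* First use: generate the enumerators. */
#define X(name) COLOR_##name,
enum color { COLOR_LIST COLOR_COUNT };
#undef X

/* Second use: generate a matching table of names. */
#define X(name) #name,
static const char *const color_names[] = { COLOR_LIST };
#undef X

const char *color_name(enum color c) { return color_names[c]; }
```

Adding an entry to the list updates both the enumeration and the name table, which is the main appeal of the pattern.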
Many compilers define additional, non-standard macros. A common reference for these macros is the Pre-defined C/C++ Compiler Macros project. Most compilers targeting Microsoft Windows implicitly define _WIN32. This allows code, including preprocessor commands, to compile only when targeting Windows systems. A few compilers define WIN32 instead. For such compilers that do not implicitly define the _WIN32 macro, it can be specified on the compiler's command line, using -D_WIN32.
As an example of using such macros, code can first test whether the macro __unix__ is defined. If it is, the file <unistd.h> is then included. Otherwise, it tests if a macro _WIN32 is defined instead. If it is, the file <windows.h> is then included.
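The test sequence just described can be sketched as follows. The function platform_name and its return strings are assumptions added so that the selected branch can be observed.

```c
#ifdef __unix__
#include <unistd.h>     /* POSIX declarations */
#elif defined _WIN32
#include <windows.h>    /* Windows API declarations */
#endif

/* Reports which branch the preprocessor selected on this platform. */
const char *platform_name(void) {
#ifdef __unix__
    return "unix";
#elif defined _WIN32
    return "windows";
#else
    return "other";
#endif
}
```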
Line control
The values of the predefined macros __FILE__ and __LINE__ can be set for a subsequent line via the #line directive. For example, after the directive #line 314 "pi.c", __LINE__ expands to 314 on the following line and __FILE__ to "pi.c".
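A short sketch of the #line directive in use; the two function names are hypothetical and exist only to expose the macro values.

```c
#line 314 "pi.c"
int line_here(void) { return __LINE__; }          /* this line is numbered 314 */
const char *file_here(void) { return __FILE__; }  /* file name is now "pi.c" */
```

The directive also affects the file name and line numbers that the compiler reports in diagnostics for the lines that follow.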
Operators
The preprocessor is capable of interpreting operators and evaluating very basic expressions, such as integer constants, arithmetic operators, comparison operators, logical operators, bitwise operations, the defined operator, and the # stringification operator. This allows the preprocessor to evaluate conditions in directives such as #if.
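As a sketch of such an evaluation, the following condition combines arithmetic, a comparison, a logical operator, and the defined operator. MAJOR, MINOR, LEGACY_MODE, and FEATURE_ENABLED are hypothetical macros.

```c
#define MAJOR 2
#define MINOR 5

/* Evaluated entirely by the preprocessor: 2 * 100 + 5 is 205, and
   LEGACY_MODE is not defined, so the first branch is taken. */
#if (MAJOR * 100 + MINOR) >= 205 && !defined(LEGACY_MODE)
#define FEATURE_ENABLED 1
#else
#define FEATURE_ENABLED 0
#endif

int feature_enabled(void) { return FEATURE_ENABLED; }
```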
Defined operator
While the defined operator, denoted by defined, is not a directive in its own right, if it is read within a directive, it is interpreted by the preprocessor and determines whether a macro has been defined.
The defined operator can be invoked either with parentheses, as in defined(MACRO), or without them, as in defined MACRO.
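Both forms of the defined operator are shown in the sketch below; HAVE_FOO and the other names are hypothetical.

```c
#define HAVE_FOO

#if defined(HAVE_FOO)      /* form with parentheses */
#define WITH_PARENS 1
#else
#define WITH_PARENS 0
#endif

#if defined HAVE_FOO       /* form without parentheses */
#define WITHOUT_PARENS 1
#else
#define WITHOUT_PARENS 0
#endif

/* Both forms test the same condition and give the same answer. */
int both_forms_agree(void) {
    return WITH_PARENS == 1 && WITHOUT_PARENS == 1;
}
```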
Token stringification operator
The stringification operator (a.k.a. stringizing operator), denoted by #, converts a token into a string literal, escaping any quotes or backslashes as needed. Given the definition #define str(s) #s, str(\n) expands to "\n" and str(p = "foo\n";) expands to "p = \"foo\\n\";".
If stringification of the expansion of a macro argument is desired, two levels of macros must be used. Given the definitions #define str(s) #s, #define xstr(s) str(s), and #define foo 4, str(foo) expands to "foo" and xstr(foo) expands to "4".
A macro argument cannot be combined with additional text and then stringified. However, a series of adjacent string literals and stringified arguments, also string literals, are concatenated by the C compiler.
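The one-level and two-level forms can be compared with the sketch below. The macro definitions are assumed to be the conventional ones for this idiom, and the wrapper functions are hypothetical.

```c
#define str(s) #s
#define xstr(s) str(s)
#define foo 4

/* The argument is stringified as written: "foo". */
const char *direct(void) { return str(foo); }

/* The extra level expands foo to 4 before stringification: "4". */
const char *indirect(void) { return xstr(foo); }

/* Quotes and backslashes inside the argument are escaped. */
const char *quoted(void) { return str(p = "foo\n";); }

/* Adjacent string literals are joined by the compiler. */
const char *labelled(void) { return "foo is " xstr(foo); }
```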
Token concatenation
The token pasting operator, denoted by ##, concatenates two tokens into one. Given the definition #define DECLARE_STRUCT_TYPE(name) typedef struct name##_s name##_t, DECLARE_STRUCT_TYPE(g_object) expands to typedef struct g_object_s g_object_t.
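A sketch of the token pasting example in a compilable form. The macro definition is assumed from the stated expansion; the struct member and the function object_id are added only for illustration.

```c
#define DECLARE_STRUCT_TYPE(name) typedef struct name##_s name##_t

DECLARE_STRUCT_TYPE(g_object);   /* typedef struct g_object_s g_object_t; */
struct g_object_s { int id; };   /* completes the struct type */

/* Uses the typedef name produced by the macro. */
int object_id(void) {
    g_object_t obj = { 42 };
    return obj.id;
}
```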
Abort
Processing can be aborted via the #error directive, for example when a configuration macro has an unsupported value.
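A minimal sketch of #error guarding a configuration value. BUFFER_SIZE, the threshold of 64, and the message text are assumptions made for the illustration.

```c
#define BUFFER_SIZE 256

/* Compilation stops with the given message if the check fails. */
#if BUFFER_SIZE < 64
#error "BUFFER_SIZE must be at least 64"
#endif

int buffer_size(void) { return BUFFER_SIZE; }
```

Because BUFFER_SIZE is 256 here, the #error line is skipped and the file compiles normally.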
Warning
As of C23 and C++23, a warning directive, #warning, to print a message without aborting is provided. A typical use is to warn about the use of deprecated functionality.
Prior to C23 and C++23, this directive existed in many compilers as a non-standard feature, such as the C compilers by GNU, Intel, Microsoft and IBM. Because it was non-standard, its form varied: GNU, Intel and IBM compilers accepted #warning, while Microsoft compilers used #pragma message.
Non-standard features
Pragma
The #pragma directive is defined by standard languages, but with little or no requirements for syntax after its name so that compilers are free to define subsequent syntax and associated behavior. For instance, a pragma is often used to allow suppression of error messages, manage heap and stack debugging and so on.
C99 introduced a few standard pragmas, taking the form #pragma STDC ..., which are used to control the floating-point implementation. The alternative, macro-like form _Pragma(...) was also added.
One of the most popular uses of the #pragma directive is #pragma once, which behaves the same way an #include guard would, condensed into a single directive placed at the top of the file. Despite being non-standard, it is supported by most compilers.
Trigraphs
Many implementations do not support trigraphs or do not replace them by default.
Assertion
Some Unix preprocessors provided an assertion feature which has little similarity to standard library assertions.
Include next
GCC provides #include_next for chaining headers of the same name.
Import
Unlike C and C++, Objective-C includes an #import directive that is like #include but results in a file being included only once, eliminating the need for include guards and #pragma once.
Microsoft Visual C++ (MSVC) also has an #import preprocessor directive, used to import type libraries. It is a nonstandard directive.
These should not be confused with the C++ keyword import, which is used to import C++ modules (since C++20), and is not a preprocessor directive.
Nullable
The #nullable directive in C# is used to enable and disable nullable reference types. To enable them, use #nullable enable, and #nullable disable to disable them.
Region
The #region and #endregion directives in C# are used to expand/collapse sections of code in IDEs, and have no effect on actual compilation of the program. They are primarily used for code organisation and readability.
Other uses
Traditionally, the C preprocessor was a separate development tool from the compiler with which it is usually used. In that case, it can be used separately from the compiler. Notable examples include use with the (deprecated) imake system and for preprocessing Fortran. However, use as a general purpose preprocessor is limited since the source code language must be relatively C-like for the preprocessor to parse it. The GNU Fortran compiler runs "traditional mode" CPP before compiling Fortran code if certain file extensions are used. Intel offers a Fortran preprocessor, fpp, for use with the ifort compiler, which has similar capabilities. CPP also works acceptably with most assembly languages and Algol-like languages. This requires that the language syntax not conflict with CPP syntax, which means no lines starting with # and that double quotes, which CPP interprets as string literals and thus ignores, don't have syntactical meaning other than that. The "traditional mode" (acting like a pre-ISO C preprocessor) is generally more permissive and better suited for such use.
Some modern compilers such as the GNU C Compiler provide preprocessing as a feature of the compiler, not as a separate tool.
Limitations
Text substitution limitations
Text substitution has a relatively high risk of causing a software bug as compared to other programming constructs.
;Hidden multiple evaluation
Consider the common definition of a ''max'' macro with parameters ''a'' and ''b'' and replacement text (((a) > (b)) ? (a) : (b)), in which each parameter appears twice. With int i = 1, j = 2;, the result of max(i,j) is 2. If ''a'' and ''b'' were only evaluated once, the result of max(i++,j++) would be the same, but with double evaluation the result is 3.
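The double evaluation can be demonstrated with the following sketch, which uses the conventional definition of the max macro; the two wrapper functions are hypothetical.

```c
#define max(a, b) (((a) > (b)) ? (a) : (b))

/* No side effects in the arguments: the result is 2. */
int max_plain(void) {
    int i = 1, j = 2;
    return max(i, j);
}

/* j++ is evaluated twice: once in the comparison (yielding 2) and
   once more as the selected operand (yielding 3). */
int max_with_side_effects(void) {
    int i = 1, j = 2;
    return max(i++, j++);
}
```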
;Hidden order of operation
Failure to bracket arguments can lead to unexpected results. For example, a macro to double a value might be written with the parameter x and the replacement text 2 * x. Then double(1 + 2) expands to 2 * 1 + 2, which, due to order of operations, evaluates to 4 when the expected result is 6. To mitigate this problem, a macro should bracket all expressions and substitution variables, as in (2 * (x)).
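The two variants can be compared side by side. Since double is a C keyword, the sketch uses the hypothetical names DOUBLE_UNSAFE and DOUBLE_SAFE instead.

```c
#define DOUBLE_UNSAFE(x) 2 * x      /* argument not bracketed */
#define DOUBLE_SAFE(x)   (2 * (x))  /* argument and result bracketed */

/* Expands to 2 * 1 + 2, which evaluates to 4. */
int double_unsafe(void) { return DOUBLE_UNSAFE(1 + 2); }

/* Expands to (2 * (1 + 2)), which evaluates to 6. */
int double_safe(void) { return DOUBLE_SAFE(1 + 2); }
```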
Not general purpose
The C preprocessor is not Turing-complete, but comes close. Recursive computations can be specified, but with a fixed upper bound on the amount of recursion performed. However, the C preprocessor is not designed to be, nor does it perform well as, a general-purpose programming language. As the C preprocessor does not have features of some other preprocessors, such as recursive macros, selective expansion according to quoting, and string evaluation in conditionals, it is very limited in comparison to a more general macro processor such as m4.
Phase out
Due to its limitations and lack of type safety (as the preprocessor is completely oblivious to C/C++ grammar, performing only text substitutions), C and C++ language features have been added over the years to minimize the value and need for the preprocessor.
;Constant
For a long time, a preprocessor macro provided the preferred way to define a constant value. An alternative has always been to define a const variable, but that results in consuming runtime memory. A newer language construct (since C++11 and C23), constexpr, allows for declaring a compile-time constant value that need not consume runtime memory.
;Inline function
For a long time, a function-like macro was the only way to define function-like behavior that did not incur runtime function call overhead. Via the inline
keyword and optimizing compilers that inline automatically, some functions can be invoked without call overhead.
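As a sketch of the alternative, the following inline function takes the place of a function-like macro. The names square and square_once are hypothetical. Unlike a macro, the function evaluates its argument exactly once and checks its type.

```c
/* Candidate for inlining; the argument is evaluated once. */
static inline int square(int x) {
    return x * x;
}

/* n++ is evaluated a single time: the call receives 3 and n ends at 4,
   so the result is 9 * 10 + 4. */
int square_once(void) {
    int n = 3;
    int r = square(n++);
    return r * 10 + n;
}
```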
;Import
The include directive limits code structure since it only allows including the content of one file into another. More modern languages support a module concept that has public symbols that other modules import instead of including file content. Many contend that resulting code has reduced boilerplate and is easier to maintain since there is only one file for a module, not both a header and a body. C++20 adds modules, and an import statement that is not handled via preprocessing. Modules in C++ compile faster and link faster than traditional headers, and eliminate the necessity of #include guards or #pragma once. Until C++26, the import, export, and module keywords were partially handled by the preprocessor.
For code bases that cannot migrate to modules immediately, C++ also offers "header units" as a feature, which allows header files to be imported in the same way a module would. Unlike modules, header units may emit macros, offering minimal breakage during migration. Header units are designed to be a transitional solution before totally migrating to modules. For instance, a standard library header can be imported with import <vector>; rather than included with #include <vector>.
See also
* C syntax
* C++ syntax
* C# syntax
* Make
* m4 (computer language)
* PL/I preprocessor