In C and C++ programming language terminology, a translation unit (or, more casually, a compilation unit) is the ultimate input to a C or C++ compiler, from which an object file is generated (ISO/IEC 9899:TC3, Committee Draft of the C99 Standard, Section 5.1.1.1). A translation unit roughly consists of a source file after it has been processed by the C preprocessor, meaning that header files listed in #include directives are literally included, sections of code within #ifndef may be included, and macros have been expanded. A C++ module is also a translation unit.
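As a minimal sketch of the three effects just described, consider the hypothetical source file below. The names MAX_ITEMS, CLAMP and clamp_items are invented for illustration and do not come from any standard or library.

```c
/* demo.c: a hypothetical source file. After preprocessing, everything
 * that remains forms a single translation unit. */
#include <stddef.h>   /* replaced by the full text of the header */

#ifndef MAX_ITEMS     /* this section is kept only if MAX_ITEMS is undefined */
#define MAX_ITEMS 8
#endif

/* Function-like macro; expanded textually wherever it is used. */
#define CLAMP(n) ((n) > MAX_ITEMS ? MAX_ITEMS : (n))

/* With MAX_ITEMS left at 8, the body seen by the compiler reads:
 *   return ((requested) > 8 ? 8 : (requested));
 */
size_t clamp_items(size_t requested)
{
    return CLAMP(requested);
}
```

Many compilers can print the preprocessed text instead of compiling it; GCC and Clang, for example, do so when given the -E option.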
Context
A C program consists of units called source files (or preprocessing files), which, in addition to source code, include directives for the C preprocessor. A translation unit is the output of the C preprocessor: a source file after it has been preprocessed.
Preprocessing notably consists of expanding a source file to recursively replace all #include directives with the literal contents of the file named in the directive (usually header files, but possibly other source files); the result of this step is a preprocessing translation unit. Further steps include expansion of macros defined with #define directives and conditional compilation controlled by #ifdef directives, among others; this translates the preprocessing translation unit into a translation unit. From a translation unit, the compiler generates an object file, which can be further processed and linked (possibly with other object files) to form an executable program.
Note that the preprocessor is in principle language agnostic, and is a lexical preprocessor, working at the lexical analysis level: it does not do parsing, and thus is unable to do any processing specific to C syntax. The input to the compiler is the translation unit, so the compiler does not see any preprocessor directives, which have all been processed before compiling starts. While a given translation unit is fundamentally based on a file, the actual source code fed into the compiler may appear substantially different from the source file that the programmer views, particularly due to the recursive inclusion of headers.
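The lexical nature of the preprocessor can be illustrated with a small sketch. All macro and function names below, including the USE_NAIVE_SQUARE switch, are hypothetical. The naive macro pastes its argument tokens without any knowledge of C operator precedence, so the compiler receives an expression the programmer probably did not intend.

```c
/* Naive macro: the argument tokens are substituted as-is, unparsed. */
#define SQUARE_NAIVE(x) x * x

/* Parenthesized macro: the grouping is restored by hand. */
#define SQUARE_SAFE(x) ((x) * (x))

/* Conditional compilation: exactly one definition of SQUARE survives
 * into the translation unit. USE_NAIVE_SQUARE is a hypothetical switch. */
#ifdef USE_NAIVE_SQUARE
#define SQUARE(x) SQUARE_NAIVE(x)
#else
#define SQUARE(x) SQUARE_SAFE(x)
#endif

/* The compiler sees: return 1 + 2 * 1 + 2; */
int naive_square_of_sum(void)
{
    return SQUARE_NAIVE(1 + 2);
}

/* The compiler sees: return ((1 + 2) * (1 + 2)); */
int safe_square_of_sum(void)
{
    return SQUARE_SAFE(1 + 2);
}

/* Uses whichever definition of SQUARE was selected above. */
int selected_square(int value)
{
    return SQUARE(value);
}
```

Because the substitution is purely token-based, the first function evaluates 1 + 2 * 1 + 2 rather than the square of 3. By the time compilation starts, no trace of the #define or #ifdef lines remains.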
Scope
Translation units define a scope, roughly file scope, that functions similarly to module scope; in C terminology this is referred to as internal linkage, which is one of the kinds of linkage in C. Names (functions and variables) declared outside of a function block may be visible only within a given translation unit, in which case they are said to have internal linkage and are not visible to the linker, or may be visible to other object files, in which case they are said to have external linkage and are visible to the linker.
C does not have a notion of modules. However, separate object files (and hence also the translation units used to produce object files) function similarly to separate modules, and if a source file does not include other source files, internal linkage (translation unit scope) may be thought of as "file scope, including all header files".
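A minimal sketch of the two kinds of visibility, using a hypothetical translation unit whose names (call_count, record_call, counter_increment, counter_read) are invented for illustration: the static keyword gives a file-scope name internal linkage, while file-scope functions without it have external linkage by default.

```c
/* counter.c: a hypothetical translation unit. */

/* Internal linkage: visible only inside this translation unit. */
static int call_count;

/* Internal linkage: a private helper, not visible to the linker. */
static void record_call(void)
{
    call_count++;
}

/* External linkage: other translation units may declare and call this. */
int counter_increment(void)
{
    record_call();
    return call_count;
}

/* External linkage: read-only access to the private counter. */
int counter_read(void)
{
    return call_count;
}
```

Another translation unit could declare int counter_increment(void); and call it after linking, whereas an attempt to reach call_count from there through an extern declaration would typically fail at link time, because no externally visible symbol of that name exists.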
Code organization
The bulk of a project's code is typically held in files with a .c suffix (or .cpp, .cxx or .cc for C++, of which .cpp is the most common). C++ module files use the extensions .cppm, .ixx or .mxx, of which .cppm and .ixx are the most popular. Files intended to be included typically have a .h suffix (.hpp or .hh are also used for C++, but .h is the most common even for C++), and generally do not contain function or variable definitions, to avoid name conflicts when headers are included in multiple source files, as is often the case. Header files can be, and often are, included in other header files. It is standard practice for all .c files in a project to include at least one .h file.
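The split between declarations in a header and definitions in a source file might look like the sketch below. The file names temperature.h and temperature.c and both function names are hypothetical. To keep the example compilable as a single block, the header text is pasted in place of the #include directive, which is also what the preprocessor would do.

```c
/* ---- temperature.h (hypothetical header): declarations only ---- */
#ifndef TEMPERATURE_H
#define TEMPERATURE_H

double celsius_to_fahrenheit(double celsius);
double fahrenheit_to_celsius(double fahrenheit);

#endif /* TEMPERATURE_H */

/* ---- temperature.c (hypothetical source file): definitions ----
 * In a real project this file would begin with
 *     #include "temperature.h"
 * and the header text above would live in its own file. */

double celsius_to_fahrenheit(double celsius)
{
    return celsius * 9.0 / 5.0 + 32.0;
}

double fahrenheit_to_celsius(double fahrenheit)
{
    return (fahrenheit - 32.0) * 5.0 / 9.0;
}
```

Because the header holds only declarations, any number of source files can include it without producing duplicate definitions at link time; the include guard additionally keeps the header from being processed twice within one translation unit.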
See also
* Single compilation unit