Name Decoration

In compiler construction, name mangling (also called name decoration) is a technique used to solve various problems caused by the need to resolve unique names for programming entities in many modern programming languages. It provides a means to encode added information in the name of a function, structure, class, or another data type, to pass more semantic information from the compiler to the linker.

The need for name mangling arises where a language allows different entities to be named with the same identifier as long as they occupy a different namespace (typically defined by a module, class, or explicit namespace directive) or have different type signatures (such as in function overloading). It is required in these uses because each signature might require a different, specialized calling convention in the machine code.

Any object code produced by compilers is usually linked with other pieces of object code (produced by the same or another compiler) by a type of program called a linker. The linker needs a great deal of information on each program entity. For example, to correctly link a function it needs its name, the number of arguments and their types, and so on.

The simple programming languages of the 1970s, like C, only distinguished subroutines by their name, ignoring other information including parameter and return types. Later languages, like C++, defined stricter requirements for routines to be considered "equal", such as the parameter types, return type, and calling convention of a function. These requirements enable method overloading and detection of some bugs (such as using different definitions of a function when compiling different source code files). These stricter requirements needed to work with extant programming tools and conventions. Thus, added requirements were encoded in the name of the symbol, since that was the only information a traditional linker had about a symbol.


Examples


C

Although name mangling is not generally required or used by languages that do not support function overloading, like C and classic Pascal, they use it in some cases to provide added information about a function. For example, compilers targeted at Microsoft Windows platforms support a variety of calling conventions, which determine the manner in which parameters are sent to subroutines and results are returned. Because the different calling conventions are incompatible with one another, compilers mangle symbols with codes detailing which convention should be used to call the specific routine.

The mangling scheme for Windows was established by Microsoft and has been informally followed by other compilers including Digital Mars, Borland, and GNU Compiler Collection (GCC) when compiling code for the Windows platforms. The scheme even applies to other languages, such as Pascal, D, Delphi, Fortran, and C#. This allows subroutines written in those languages to call, or be called by, extant Windows libraries using a calling convention different from their default.

When compiling the following C examples:

    int _cdecl    f (int x) { return 0; }
    int _stdcall  g (int y) { return 0; }
    int _fastcall h (int z) { return 0; }

32-bit compilers emit, respectively:

    _f
    _g@4
    @h@4

In the stdcall and fastcall mangling schemes, the function is encoded as _name@X and @name@X respectively, where X is the number of bytes, in decimal, of the argument(s) in the parameter list (including those passed in registers, for fastcall). In the case of cdecl, the function name is merely prefixed by an underscore.

The 64-bit convention on Windows (Microsoft C) has no leading underscore. This difference may in some rare cases lead to unresolved externals when porting such code to 64 bits. For example, Fortran code can use 'alias' to link against a C method by name as follows:

    SUBROUTINE f()
    !DEC$ ATTRIBUTES C, ALIAS:'_f' :: f
    END SUBROUTINE

This will compile and link fine under 32 bits, but generate an unresolved external _f under 64 bits. One workaround for this is not to use 'alias' at all (in which case the method names typically need to be capitalized in C and Fortran). Another is to use the BIND option:

    SUBROUTINE f() BIND(C,NAME="f")
    END SUBROUTINE

In C, most compilers also mangle static functions and variables (and in C++ functions and variables declared static or put in the anonymous namespace) in translation units using the same mangling rules as for their non-static versions. If functions with the same name (and parameters for C++) are also defined and used in different translation units, they will also mangle to the same name, potentially leading to a clash. However, they are not equivalent: each one is called only within its own translation unit. Compilers are usually free to emit arbitrary mangling for these functions, because it is illegal to access them from other translation units directly, so they never need to be linked across different object files.

To prevent linking conflicts, compilers will use standard mangling, but will use so-called 'local' symbols. When linking many such translation units there might be multiple definitions of a function with the same name, but the resulting code will only call one or another depending on which translation unit it came from. This is usually done using the relocation mechanism.
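The 32-bit Windows decoration rules for the _cdecl, _stdcall, and _fastcall examples above can be sketched in a few lines. This is a minimal illustration, not any compiler's actual implementation: the function name decorate and the convention labels are hypothetical, and the caller is assumed to supply the total size of the arguments in bytes.

```python
def decorate(name: str, convention: str, arg_bytes: int = 0) -> str:
    """Return a 32-bit Windows style decorated C symbol (illustrative sketch).

    `convention` is one of "cdecl", "stdcall", "fastcall" (labels chosen for
    this sketch); `arg_bytes` is the total size of the arguments in bytes.
    """
    if convention == "cdecl":
        # cdecl: the name is only prefixed with an underscore.
        return f"_{name}"
    if convention == "stdcall":
        # stdcall: leading underscore, then '@' and the argument byte count.
        return f"_{name}@{arg_bytes}"
    if convention == "fastcall":
        # fastcall: leading '@', then '@' and the argument byte count.
        return f"@{name}@{arg_bytes}"
    raise ValueError(f"unknown calling convention: {convention}")


if __name__ == "__main__":
    # One 4-byte int argument for each function, as in the C examples.
    for fn, conv in (("f", "cdecl"), ("g", "stdcall"), ("h", "fastcall")):
        print(decorate(fn, conv, 4))
```

Because the byte count is part of the stdcall and fastcall names, a caller and callee that disagree about the argument list end up with different symbols and fail at link time instead of corrupting the stack at run time.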


C++

C++ compilers are the most widespread users of name mangling. The first C++ compilers were implemented as translators to C source code, which would then be compiled by a C compiler to object code; because of this, symbol names had to conform to C identifier rules. Even later, with the emergence of compilers that produced machine code or assembly directly, the system's linker generally did not support C++ symbols, and mangling was still required.

The C++ language does not define a standard decoration scheme, so each compiler uses its own. C++ also has complex language features, such as classes, templates, namespaces, and operator overloading, that alter the meaning of specific symbols based on context or usage. Meta-data about these features can be disambiguated by mangling (decorating) the name of a symbol. Because the name-mangling systems for such features are not standardized across compilers, few linkers can link object code that was produced by different compilers.


Simple example

A single C++ translation unit might define two functions named f():

    int  f () { return 1; }
    int  f (int) { return 0; }
    void g () { int i = f(), j = f(0); }

These are distinct functions, with no relation to each other apart from the name. The C++ compiler will therefore encode the type information in the symbol name, the result being something resembling:

    int  __f_v () { return 1; }
    int  __f_i (int) { return 0; }
    void __g_v () { int i = __f_v(), j = __f_i(0); }

Even though its name is unique, g() is still mangled: name mangling applies to all C++ symbols (except for those in an extern "C" block).
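The illustrative __f_v and __f_i names above follow a simple pattern: the function name, then one code per parameter type, with v standing for an empty parameter list. A minimal sketch of that toy scheme follows; the helper toy_mangle and the type-code table are hypothetical and do not correspond to any real compiler.

```python
# Hypothetical one-letter codes for the toy scheme; real compilers use
# far richer encodings.
TYPE_CODES = {"int": "i", "char": "c", "double": "d"}


def toy_mangle(name: str, param_types: list[str]) -> str:
    """Encode a function name and its parameter types into a single symbol."""
    # An empty parameter list is encoded as 'v' (void).
    suffix = "".join(TYPE_CODES[t] for t in param_types) or "v"
    return f"__{name}_{suffix}"


if __name__ == "__main__":
    print(toy_mangle("f", []))       # overload taking no arguments
    print(toy_mangle("f", ["int"]))  # overload taking one int
    print(toy_mangle("g", []))       # unique name, still mangled
```

The two overloads of f receive different symbols, so a linker that compares names only as plain strings can still tell them apart.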


Complex example

The mangled symbols in this example, in the comments below the respective identifier name, are those produced by the GNU GCC 3.x compilers, according to the IA-64 (Itanium) ABI:

    namespace wikipedia
    {
       class article
       {
       public:
          std::string format ();
          // = _ZN9wikipedia7article6formatEv

          bool print_to (std::ostream&);
          // = _ZN9wikipedia7article8print_toERSo
       };
    }

All mangled symbols begin with _Z (note that an identifier beginning with an underscore followed by a capital letter is a reserved identifier in C, so conflict with user identifiers is avoided); for nested names (including both namespaces and classes), this is followed by N, then a series of <length, id> pairs (the length being the length of the next identifier), and finally E. For example, wikipedia::article::format becomes:

    _ZN9wikipedia7article6formatE

For functions, this is then followed by the type information; as format() takes no parameters, this is simply v; hence:

    _ZN9wikipedia7article6formatEv

For print_to, the standard type std::ostream (which is a typedef for std::basic_ostream<char, std::char_traits<char> >) is used, which has the special alias So; a reference to this type is therefore RSo, with the complete name for the function being:

    _ZN9wikipedia7article8print_toERSo
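The nested-name part of this encoding is mechanical enough to sketch: the prefix _Z, the marker N, each component preceded by its length, the terminator E, and then the already-encoded parameter types. The helper mangle_nested below is hypothetical and deliberately ignores everything else the ABI specifies (substitutions, templates, qualifiers); it assumes ASCII identifiers and expects the parameter codes to be supplied by the caller.

```python
def mangle_nested(path: list[str], param_codes: str = "v") -> str:
    """Build an Itanium-style symbol for a function nested in namespaces/classes.

    `path` lists the components from outermost scope to the function name;
    `param_codes` is the pre-encoded parameter list ('v' means no parameters).
    """
    # Each component is emitted as <length><identifier>.
    body = "".join(f"{len(part)}{part}" for part in path)
    return f"_ZN{body}E{param_codes}"


if __name__ == "__main__":
    print(mangle_nested(["wikipedia", "article", "format"]))
    # 'RSo' encodes a reference (R) to std::ostream (So).
    print(mangle_nested(["wikipedia", "article", "print_to"], "RSo"))
```

Prefixing each identifier with its length means no separator character is needed, so the result stays a valid C identifier.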


How different compilers mangle the same functions

There is no standardized scheme by which even trivial C++ identifiers are mangled, and consequently different compilers (or even different versions of the same compiler, or the same compiler on different platforms) mangle public symbols in radically different (and thus totally incompatible) ways. Consider how different C++ compilers mangle the same functions:

Notes:
* The Compaq C++ compiler on OpenVMS VAX and Alpha (but not IA-64) and Tru64 UNIX has two name mangling schemes. The original, pre-standard scheme is known as the ARM model, and is based on the name mangling described in the Annotated C++ Reference Manual (ARM). With the advent of new features in standard C++, particularly templates, the ARM scheme became more and more unsuitable: it could not encode certain function types, or produced identically mangled names for different functions. It was therefore replaced by the newer American National Standards Institute (ANSI) model, which supported all ANSI template features, but was not backward compatible.
* On IA-64, a standard application binary interface (ABI) exists (see external links), which defines (among other things) a standard name-mangling scheme, and which is used by all the IA-64 compilers. GNU GCC 3.x has further adopted the name mangling scheme defined in this standard for use on other, non-Intel platforms.
* The Visual Studio and Windows SDK include the program undname, which prints the C-style function prototype for a given mangled name.
* On Microsoft Windows, the Intel compiler and Clang use the Visual C++ name mangling for compatibility.


Handling of C symbols when linking from C++

The job of the common C++ idiom:

    #ifdef __cplusplus
    extern "C" {
    #endif
        /* ... */
    #ifdef __cplusplus
    }
    #endif

is to ensure that the symbols within are "unmangled": that the compiler emits a binary file with their names undecorated, as a C compiler would do. As C language definitions are unmangled, the C++ compiler needs to avoid mangling references to these identifiers.

For example, the standard strings library, <string.h>, usually contains something resembling:

    #ifdef __cplusplus
    extern "C" {
    #endif

    void *memset (void *, int, size_t);
    int   strcmp (const char *, const char *);
    char *strcpy (char *, const char *);

    #ifdef __cplusplus
    }
    #endif

Thus, code such as:

    if (strcmp(argv[1], "-x") == 0)
        strcpy(a, argv[2]);
    else
        memset (a, 0, sizeof(a));

uses the correct, unmangled strcmp and memset. If the extern "C" had not been used, the (SunPro) C++ compiler would produce code equivalent to:

    if (__1cGstrcmp6Fpkc1_i_(argv[1], "-x") == 0)
        __1cGstrcpy6Fpcpkc_0_(a, argv[2]);
    else
        __1cGmemset6FpviI_0_ (a, 0, sizeof(a));

Since those symbols do not exist in the C runtime library (e.g. libc), link errors would result.


Standardized name mangling in C++

It would seem that standardized name mangling in the C++ language would lead to greater interoperability between compiler implementations. However, such a standardization by itself would not suffice to guarantee C++ compiler interoperability and it might even create a false impression that interoperability is possible and safe when it isn't. Name mangling is only one of several application binary interface (ABI) details that need to be decided and observed by a C++ implementation. Other ABI aspects like exception handling, virtual table layout, and structure and stack frame padding also cause differing C++ implementations to be incompatible.

Further, requiring a particular form of mangling would cause issues for systems where implementation limits (e.g., length of symbols) dictate a particular mangling scheme. A standardized requirement for name mangling would also prevent an implementation where mangling was not required at all, for example a linker that understood the C++ language.

The C++ standard therefore does not attempt to standardize name mangling. On the contrary, the Annotated C++ Reference Manual (also known as ARM, section 7.2.1c) actively encourages the use of different mangling schemes to prevent linking when other aspects of the ABI are incompatible. Nevertheless, as detailed in the section above, on some platforms the full C++ ABI has been standardized, including name mangling.


Real-world effects of C++ name mangling

Because C++ symbols are routinely exported from DLL and shared object files, the name mangling scheme is not merely a compiler-internal matter. Different compilers (or different versions of the same compiler, in many cases) produce such binaries under different name decoration schemes, meaning that symbols are frequently unresolved if the compilers used to create the library and the program using it employed different schemes. For example, if a system with multiple C++ compilers installed (e.g., GNU GCC and the OS vendor's compiler) wished to install the Boost C++ Libraries, they would have to be compiled multiple times (once for GCC and once for the vendor compiler).

It is good for safety purposes that compilers producing incompatible object codes (codes based on different ABIs, regarding e.g., classes and exceptions) use different name mangling schemes. This guarantees that these incompatibilities are detected at the linking phase, not when executing the software (which could lead to obscure bugs and serious stability issues). For this reason, name decoration is an important aspect of any C++-related ABI.

There are instances, particularly in large, complex code bases, where it can be difficult or impractical to map the mangled name emitted within a linker error message back to the particular corresponding token/variable-name in the source. This problem can make identifying the relevant source file(s) very difficult for build or test engineers even if only one compiler and linker are in use. Demanglers (including those within the linker error reporting mechanisms) sometimes help but the mangling mechanism itself may discard critical disambiguating information.


Demangle via c++filt

    $ c++filt -n _ZNK3MapI10StringName3RefI8GDScriptE10ComparatorIS0_E16DefaultAllocatorE3hasERKS0_
    Map<StringName, Ref<GDScript>, Comparator<StringName>, DefaultAllocator>::has(StringName const&) const


Demangle via builtin GCC ABI

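    #include <stdio.h>
    #include <stdlib.h>
    #include <cxxabi.h>

    int main() {
        const char *mangled_name = "_ZNK3MapI10StringName3RefI8GDScriptE10ComparatorIS0_E16DefaultAllocatorE3hasERKS0_";
        int status = -1;
        /* __cxa_demangle returns a malloc()-allocated string, or NULL on failure */
        char *demangled_name = abi::__cxa_demangle(mangled_name, NULL, NULL, &status);
        if (status == 0 && demangled_name != NULL)
            printf("Demangled: %s\n", demangled_name);
        free(demangled_name);
        return 0;
    }

Output:

    Demangled: Map<StringName, Ref<GDScript>, Comparator<StringName>, DefaultAllocator>::has(StringName const&) const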


Java

In Java, the signature of a method or a class contains its name and the types of its method arguments and return value, where applicable. The format of signatures is documented, as the language, compiler, and .class file format were all designed together (and had object-orientation and universal interoperability in mind from the start).


Creating unique names for inner and anonymous classes

The scope of anonymous classes is confined to their parent class, so the compiler must produce a "qualified" public name for the inner class, to avoid conflict where other classes with the same name (inner or not) exist in the same namespace. Similarly, anonymous classes must have "fake" public names generated for them (as the concept of anonymous classes only exists in the compiler, not the runtime). So, compiling the following Java program:

    public class foo {
        class bar {
            public int x;
        }

        public void zark () {
            Object f = new Object () {
                public String toString() {
                    return "hello";
                }
            };
        }
    }

will produce three .class files:
* foo.class, containing the main (outer) class foo
* foo$bar.class, containing the named inner class foo.bar
* foo$1.class, containing the anonymous inner class (local to method foo.zark)

All of these class names are valid (as $ symbols are permitted in the JVM specification) and these names are "safe" for the compiler to generate, as the Java language definition advises not to use $ symbols in normal Java class definitions.

Name resolution in Java is further complicated at runtime, as fully qualified names for classes are unique only inside a specific classloader instance. Classloaders are ordered hierarchically and each Thread in the JVM has a so-called context class loader, so in cases where two different classloader instances contain classes with the same name, the system first tries to load the class using the root (or system) classloader and then goes down the hierarchy to the context class loader.


Java Native Interface

Java Native Interface, Java's native method support, allows Java language programs to call out to programs written in another language (usually C or C++). There are two name-resolution concerns here, neither of which is implemented in a standardized manner:
* JVM to native name translation: this seems to be more stable, since Oracle makes its scheme public.
* Normal C++ name mangling: see above.
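The JVM to native name translation can be illustrated with a short sketch. It follows the short-name rule as I understand it from the published JNI specification: the prefix Java_, the escaped fully qualified class name, an underscore, and the escaped method name. The helper names are hypothetical, overloaded native methods (which append an escaped argument signature) are not handled, and only characters in the Basic Multilingual Plane are assumed.

```python
def jni_escape(text: str) -> str:
    """Escape a class or method name for use in a JNI native symbol (sketch)."""
    out = []
    for ch in text:
        if ch in "./":
            out.append("_")    # package separators become underscores
        elif ch == "_":
            out.append("_1")   # a literal underscore is escaped
        elif ch == ";":
            out.append("_2")   # appears only in signatures
        elif ch == "[":
            out.append("_3")   # appears only in signatures
        elif ch.isascii() and ch.isalnum():
            out.append(ch)
        else:
            out.append(f"_0{ord(ch):04x}")  # other characters by code unit
    return "".join(out)


def jni_short_name(class_name: str, method: str) -> str:
    """Return the short native symbol name for a non-overloaded native method."""
    return f"Java_{jni_escape(class_name)}_{jni_escape(method)}"


if __name__ == "__main__":
    print(jni_short_name("com.example.Foo", "bar_baz"))
    print(jni_short_name("foo$bar", "run"))
```

Note how the two schemes stack: the $ that the Java compiler introduces for an inner class is itself escaped again when the native symbol is formed.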


Python

In Python, mangling is used for class attributes that one does not want subclasses to use, which are designated as such by giving them a name with two or more leading underscores and no more than one trailing underscore. For example, __thing will be mangled, as will ___thing and __thing_, but __thing__ and __thing___ will not. Python's runtime does not restrict access to such attributes; the mangling only prevents name collisions if a derived class defines an attribute with the same name.

On encountering name mangled attributes, Python transforms these names by prepending a single underscore and the name of the enclosing class, for example:

    >>> class Test:
    ...     def __mangled_name(self):
    ...         pass
    ...     def normal_name(self):
    ...         pass
    >>> t = Test()
    >>> [attr for attr in dir(t) if "name" in attr]
    ['_Test__mangled_name', 'normal_name']
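The rule can be approximated in a few lines and checked against the interpreter itself. The helper mangle below is a hypothetical re-implementation for illustration only, not the interpreter's own code; it additionally assumes the documented detail that leading underscores are stripped from the class name before it is prepended.

```python
def mangle(class_name: str, attr: str) -> str:
    """Approximate Python's private-name mangling of `attr` inside `class_name`."""
    # Only names with two or more leading underscores that do not end in
    # two or more underscores are transformed.
    if not attr.startswith("__") or attr.endswith("__"):
        return attr
    stripped = class_name.lstrip("_")
    if not stripped:
        # A class name made only of underscores leaves the name unchanged.
        return attr
    return f"_{stripped}{attr}"


class Test:
    def __mangled_name(self):   # stored on the class as _Test__mangled_name
        pass

    def normal_name(self):
        pass


if __name__ == "__main__":
    print(mangle("Test", "__mangled_name"))
    print(hasattr(Test, mangle("Test", "__mangled_name")))
```

Since the mangled attribute remains reachable under its transformed name, the mechanism avoids accidental clashes in subclasses but does not provide access control.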


Pascal


Turbo Pascal, Delphi

To avoid name mangling in Pascal, use:

    exports
      myFunc name 'myFunc',
      myProc name 'myProc';


Free Pascal

Free Pascal supports function and operator overloading, thus it also uses name mangling to support these features. On the other hand, Free Pascal is capable of calling symbols defined in external modules created with another language and exporting its own symbols to be called by another language. For further information, consult Chapter 6.2 of the Free Pascal Programmer's Guide.


Fortran

Name mangling is also necessary in Fortran compilers, originally because the language is case insensitive. Further mangling requirements were imposed later in the evolution of the language because of the addition of modules and other features in the Fortran 90 standard. Case mangling in particular is a common issue that must be dealt with when calling Fortran libraries, such as LAPACK, from other languages, such as C.

Because of the case insensitivity, the compiler must convert the name of a subroutine or function to a standardized case and format so that it is linked in the same way regardless of how it was written. Different compilers have implemented this in various ways, and no standardization has occurred. The AIX and HP-UX Fortran compilers convert all identifiers to lower case (FOO becomes foo), while the Cray and Unicos Fortran compilers converted identifiers to all upper case (FOO stays FOO). The GNU g77 compiler converts identifiers to lower case plus an underscore (FOO becomes foo_), except that identifiers already containing an underscore have two underscores appended (FOO_BAR becomes foo_bar__), following a convention established by f2c. Many other compilers, including Silicon Graphics' (SGI) IRIX compilers, GNU Fortran, and Intel's Fortran compiler (except on Microsoft Windows), convert all identifiers to lower case plus an underscore (foo_ and foo_bar_, respectively). On Microsoft Windows, the Intel Fortran compiler defaults to uppercase without an underscore.

Identifiers in Fortran 90 modules must be further mangled, because the same procedure name may occur in different modules. Since the Fortran 2003 Standard requires that module procedure names not conflict with other external symbols, compilers tend to use the module name and the procedure name, with a distinct marker in between. For example:

    module m
    contains
       integer function five()
          five = 5
       end function five
    end module m

In this module, the name of the function is mangled differently depending on the compiler: GNU Fortran, for example, produces __m_MOD_five, while Intel's ifort and Oracle's sun95 each use a different marker between the module and procedure names. Since Fortran does not allow overloading the name of a procedure, but uses generic interface blocks and generic type-bound procedures instead, the mangled names do not need to incorporate clues about the arguments.

The Fortran 2003 BIND option overrides any name mangling done by the compiler, as shown above.
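
The case and underscore conventions described above for external (non-module) procedures can be sketched in a few lines of Python. This is only an illustration: the convention labels ("lower", "upper", "g77", "lower_underscore") and the function name are hypothetical names chosen for this sketch, not compiler options.

```python
def mangle_fortran(name: str, convention: str) -> str:
    """Return the external symbol for a Fortran identifier under one convention.

    The convention labels are hypothetical names for this sketch only.
    """
    lowered = name.lower()
    if convention == "lower":             # e.g. AIX, HP-UX
        return lowered
    if convention == "upper":             # e.g. Cray, Unicos
        return name.upper()
    if convention == "g77":               # f2c style: two underscores if one is already present
        return lowered + ("__" if "_" in lowered else "_")
    if convention == "lower_underscore":  # e.g. SGI IRIX, GNU Fortran, Intel (non-Windows)
        return lowered + "_"
    raise ValueError(f"unknown convention: {convention}")


# Show how the same source identifier maps to different linker symbols.
for convention in ("lower", "upper", "g77", "lower_underscore"):
    print(convention, mangle_fortran("Foo_Bar", convention))
```

A C caller has to declare the routine under whichever of these spellings the Fortran compiler in use actually emits, which is why mixed-language builds often select the spelling through a macro.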


Rust

Function names are mangled by default in Rust. However, this can be disabled with the #[no_mangle] function attribute, which can be used to export functions to C, C++, or Objective-C. Further, along with the #[start] function attribute or the #[no_main] crate attribute, it allows the user to define a C-style entry point for the program.

Rust has used several symbol mangling schemes, which can be selected at compile time with a compiler option. The following manglers are defined:

* legacy: A C++ style mangling based on the Itanium IA-64 C++ ABI. Symbols begin with _ZN, and filename hashes are used for disambiguation. Used since Rust 1.9.
* v0: An improved version of the legacy scheme, with changes for Rust. Symbols begin with _R. Polymorphism can be encoded. Functions do not have return types encoded (Rust does not have overloading). Unicode names use modified punycode. Compression (backreferences) uses byte-based addressing. Used since Rust 1.37.

Examples are provided in the Rust tests.
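
As a rough illustration of how the two schemes can be told apart, the following Python sketch classifies a symbol by its prefix (_ZN for legacy, _R for v0) and splits a legacy symbol into its length-prefixed path components. The function names and the example symbol are hypothetical. The sketch ignores the character escapes that real legacy symbols may contain, and because Itanium C++ symbols can also begin with _ZN, the classification is only a heuristic.

```python
def classify_rust_symbol(symbol: str) -> str:
    """Guess the mangling scheme from the symbol prefix (heuristic only)."""
    if symbol.startswith("_R"):
        return "v0"
    if symbol.startswith("_ZN"):
        return "legacy"
    return "unmangled"


def split_legacy_path(symbol: str) -> list:
    """Split a legacy-style symbol into its <length><name> components."""
    if not (symbol.startswith("_ZN") and symbol.endswith("E")):
        raise ValueError("not a legacy-style symbol")
    body = symbol[3:-1]  # drop the "_ZN" prefix and the closing "E"
    parts, pos = [], 0
    while pos < len(body):
        end = pos
        while end < len(body) and body[end].isdigit():
            end += 1
        if end == pos:
            raise ValueError("expected a length prefix")
        length = int(body[pos:end])
        parts.append(body[end:end + length])
        pos = end + length
    return parts


# Hypothetical symbol: crate "example", function "greet", 16-digit hash suffix.
symbol = "_ZN7example5greet17h0123456789abcdefE"
print(classify_rust_symbol(symbol), split_legacy_path(symbol))
```

The last component of a legacy symbol is the hash used for disambiguation, so a demangler would normally hide it when printing the path.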


Objective-C

Essentially two forms of method exist in Objective-C: the class ("static") method and the instance method. A method declaration in Objective-C is of the following form:

    + (return-type) name0:parameter0 name1:parameter1 ...
    - (return-type) name0:parameter0 name1:parameter1 ...

Class methods are signified by +, instance methods use -. A typical class method declaration may then look like:

    + (id) initWithX: (int) number andY: (int) number;
    + (id) new;

With instance methods looking like this:

    - (id) value;
    - (id) setValue: (id) new_value;

Each of these method declarations has a specific internal representation. When compiled, each method is named according to the following scheme for class methods:

    _c_Class_name0_name1_ ...

and this for instance methods:

    _i_Class_name0_name1_ ...

The colons in the Objective-C syntax are translated to underscores. So, the Objective-C class method + (id) initWithX: (int) number andY: (int) number;, if belonging to a class named, for example, Point, would translate as _c_Point_initWithX_andY_, and the instance method - (id) value; (belonging to the same class) would translate to _i_Point_value.

Each of the methods of a class is labeled in this way. However, looking up a method that a class may respond to would be tedious if all methods were represented in this fashion. Each of the methods is therefore assigned a unique symbol (such as an integer), known as a selector. In Objective-C, one can manage selectors directly; they have a specific type in Objective-C, SEL.

During compiling, a table is built that maps the textual representation, such as _i_Point_value, to selectors (which are given the type SEL). Managing selectors is more efficient than manipulating the text representation of a method. Note that a selector only matches a method's name, not the class it belongs to: different classes can have different implementations of a method with the same name. Because of this, implementations of a method are given a specific identifier too; these are known as implementation pointers, and are also given a type, IMP.

Message sends are encoded by the compiler as calls to the id objc_msgSend (id receiver, SEL selector, ...) function, or one of its cousins, where receiver is the receiver of the message and selector determines the method to call. Each class has its own table that maps selectors to their implementations; the implementation pointer specifies where in memory the implementation of the method resides. There are separate tables for class and instance methods. Apart from being stored in the SEL to IMP lookup tables, the functions are essentially anonymous.

The value for a selector does not vary between classes. This enables polymorphism.

The Objective-C runtime maintains information about the argument and return types of methods. However, this information is not part of the name of the method, and can vary from class to class.

Since Objective-C does not support namespaces, there is no need for the mangling of class names (that do appear as symbols in generated binaries).
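
The naming scheme described above (a _c_ or _i_ prefix, the class name, and the selector with colons turned into underscores) can be sketched in Python as follows. The function name objc_symbol and the class name Point are hypothetical choices for this illustration. The sketch covers only the textual scheme, not selector tables or message dispatch.

```python
def objc_symbol(kind: str, class_name: str, selector: str) -> str:
    """Build the internal name for a method.

    kind is "+" for a class method or "-" for an instance method;
    selector is written with its colons, e.g. "initWithX:andY:".
    """
    if kind not in ("+", "-"):
        raise ValueError("kind must be '+' or '-'")
    prefix = "_c_" if kind == "+" else "_i_"
    # Colons in the selector are translated to underscores.
    return f"{prefix}{class_name}_{selector.replace(':', '_')}"


print(objc_symbol("+", "Point", "initWithX:andY:"))
print(objc_symbol("-", "Point", "value"))
```

Parameter types never enter the name, which matches the point above that type information is kept by the runtime rather than encoded in the method name.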


Swift

Swift keeps metadata about functions (and more) in the mangled symbols referring to them. This metadata includes the function's name, attributes, module name, parameter types, return type, and more. For example:

The mangled name for a method func calculate(x: Int) -> Int of a MyClass class in module test is _TFC4test7MyClass9calculatefS0_FT1xSi_Si, for 2014 Swift. The components and their meanings are as follows:

* _T: The prefix for all Swift symbols. Everything will start with this.
* F: Non-curried function.
* C: Function of a class, i.e. a method.
* 4test: Module name, prefixed with its length.
* 7MyClass: Name of class the function belongs to, prefixed with its length.
* 9calculate: Function name, prefixed with its length.
* f: The function attribute. In this case 'f', which means a normal function.
* S0: Designates the type of the first parameter (namely the class instance) as the first in the type stack (here MyClass is not nested and thus has index 0).
* _FT: This begins the type list for the parameter tuple of the function.
* 1x: External name of first parameter of the function.
* Si: Indicates builtin Swift type Swift.Int for the first parameter.
* _Si: The return type: again Swift.Int.

Mangling for versions since Swift 4.0 is documented officially. It retains some similarity to Itanium.
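
The length-prefixed components (module, class, and function name) can be read back with a short loop. The Python sketch below is a minimal illustration, assuming an old-style (2014) symbol whose first four characters are the _T, F, and C markers. The helper name read_length_prefixed is hypothetical, and the type-encoding tail of the symbol is left undecoded.

```python
def read_length_prefixed(text: str, pos: int = 0):
    """Read consecutive <length><name> items starting at pos.

    Returns the list of names and the position just after the last one.
    """
    names = []
    while pos < len(text) and text[pos].isdigit():
        end = pos
        while end < len(text) and text[end].isdigit():
            end += 1
        length = int(text[pos:end])           # decimal length prefix
        names.append(text[end:end + length])  # the name itself
        pos = end + length
    return names, pos


# Illustrative old-style symbol; skip the "_T", "F", "C" markers first.
symbol = "_TFC4test7MyClass9calculatefS0_FT1xSi_Si"
names, rest = read_length_prefixed(symbol, len("_TFC"))
print(names)          # module, class, and function names
print(symbol[rest:])  # remaining attribute and type encoding
```

Length prefixes let a demangler split names without reserving a separator character, the same idea used by the Itanium C++ scheme.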


See also

* Application programming interface (API)
* Application binary interface (ABI)
* Calling convention
* Comparison of application virtualization software (i.e. VMs)
* Foreign function interface (FFI)
* Java Native Interface (JNI)
* Language binding
* Stropping
* SWIG



External links


* Linux Itanium ABI for C++, including name mangling scheme
* Macintosh C/C++ ABI Standard Specification
* c++filt: filter to demangle encoded C++ symbols for GNU/Intel compilers
* undname: MSVC tool to demangle names
* demangler.com: an online tool for demangling GCC and MSVC C++ symbols

* Calling conventions for different C++ compilers, by Agner Fog: contains a detailed description of name mangling schemes for various x86 and x64 C++ compilers (pp. 24–42 in the 2011-06-08 version)
* C++ Name Mangling/Demangling: a quite detailed explanation of the Visual C++ compiler name mangling scheme
* PHP UnDecorateSymbolName: a PHP script that demangles Microsoft Visual C's function names

* Code: ftp://ftp.iecc.com/pub/linker/

* Name mangling demystified, by Fivos Kefallonitis