
Name mangling

Name mangling, also known as name decoration, is a technique used by compilers to modify the names of functions, variables, classes, and other programming entities by encoding additional information (such as parameter types, return types, namespaces, and calling conventions) into the names within object files. This process ensures uniqueness for linker resolution, preventing conflicts arising from language features like function overloading, templates, and scoped identifiers, particularly in compiled languages that produce native binaries.

In C++, name mangling is essential for supporting the language's advanced features while maintaining compatibility with C-style linkers. Compilers like Microsoft Visual C++ (MSVC) and GCC (following the Itanium C++ ABI) generate decorated names that incorporate the function's signature; for instance, a simple int add(int, int) is mangled to ?add@@YAHHH@Z in MSVC or _Z3addii in GCC, allowing the linker to distinguish it from overloaded variants. The Itanium C++ ABI standardizes this for many platforms, using a hierarchical encoding scheme with substitutions for repeated elements to create compact, portable mangled names that begin with _Z for C++ entities. This approach not only resolves ambiguities during linking but also facilitates separate compilation and binary compatibility across modules.

While most prominent in C++, name mangling appears in varied forms in other languages to address similar issues of name resolution and encapsulation. In Python, it is a source-level transformation applied to class attributes starting with double underscores (e.g., __private), turning them into _ClassName__private to avoid accidental overrides in subclasses, though this is more about convention than binary linking. Languages like Java and Rust employ analogous techniques, such as encoding signatures in JVM bytecode or avoiding mangling for C ABI compatibility, but these are often tailored to runtime or platform-specific needs rather than native linkers. Overall, name mangling underscores the challenges of evolving programming languages while preserving low-level efficiency and compatibility with existing linkers.

Fundamentals

Definition

Name mangling is a technique that systematically alters the names of symbols (such as functions, variables, and classes) in the generated object code by encoding supplementary information, including parameter types, return types, namespaces, and scope qualifiers, to produce unique identifiers for linkage purposes. This encoding ensures that entities with identical base names but differing signatures or contexts can be distinguished during the linking phase, preventing resolution conflicts in the final executable. The resulting mangled names serve as the binary representation of these symbols within object files, adhering to conventions defined by the platform's application binary interface (ABI).

The scope of name mangling is centered on languages that incorporate advanced naming features like function overloading, where multiple functions may share the same name but differ in arguments; namespaces, which organize code hierarchically; and nested scopes, which introduce local name bindings. Unlike simpler name decoration in low-level assembly, where modifications might involve only basic prefixes (e.g., underscores for global symbols) or suffixes for visibility, name mangling embeds richer semantic details to support complex type systems and modular compilation.

Key terminology includes symbols, the encoded identifiers stored in object files for reference by the linker; linkage, the mechanism resolving external references across compilation units; and the ABI, a contractual specification dictating mangling rules to promote interoperability between compilers, libraries, and tools.

A representative example illustrates the transformation: under the Itanium C++ ABI, a function declared as foo(int) is mangled to _Z3fooi, where the prefix and encoded components distinguish it from other overloads or scoped variants. This process is integral to maintaining the integrity of symbol resolution without changing how the name is written in source code.

Purpose

Name mangling serves as a critical mechanism in compiled languages to resolve ambiguities arising from features like function overloading, where multiple functions share the same name but differ in parameters, such as foo(int) and foo(double). By encoding additional information into symbol names, it ensures that the linker can distinguish these entities uniquely, preventing incorrect resolutions during the linking phase.

Beyond overloading, name mangling supports language constructs such as namespaces and templates, which introduce scoped or parameterized identifiers that would otherwise collide in the flat global namespace of object files. For instance, a function bar in namespace std or a template instantiation vec<int> receives a mangled name incorporating scope and type details, enabling the linker to match references across separate translation units without name clashes. This supports overloading and generics by preserving type information at the binary level.

The primary benefits include reliable linking across modular codebases, where separate compilation units can be developed independently yet integrated seamlessly, and maintenance of binary compatibility in shared libraries. Without mangling, symbols such as "print" defined in different scopes or with different signatures would collide in the object files' flat namespace, making large-scale software development impractical. Mangling thus addresses the inherent limitations of traditional linkers designed for simpler languages like C, which lack such scoping and overloading.

Historically, name mangling emerged in the 1980s alongside the development of object-oriented extensions to C, particularly through the Cfront compiler, first developed around 1983, which translated C++ to C code while encoding names to achieve type-safe linkage beyond C's capabilities. This innovation was driven by the need to handle increasingly complex programs with features unavailable in C's simple external linkage model, paving the way for modern C++'s modular and extensible design.

General Principles

Name mangling in compilers adheres to core principles that ensure reliable symbol resolution across compilation units and tools. Primarily, mangling must be deterministic, generating identical encoded names for the same declaration regardless of compilation order or environmental variations, which supports language rules like the One Definition Rule (ODR) in C++ by letting the linker recognize when two translation units refer to the same entity. This determinism extends to handling complex features: templates encode parameter lists and substitutions to uniquely identify instantiations; virtual tables and thunks receive special names (e.g., "TV" for a virtual table in the Itanium C++ ABI), with thunks encoding the required this-pointer adjustment; and constructors and destructors receive distinct codes (e.g., "C1" for complete object constructors, "D1" for complete object destructors) to differentiate their behaviors during linking and at run time. Additionally, mangled names are designed to be reversible through demangling algorithms, enabling debuggers and tools to reconstruct source-level names for improved readability. Overall, these encodings must comply with the application binary interface (ABI), which standardizes symbol formats for interoperability between compilers, libraries, and platforms.

Mangling also accounts for scope and visibility to maintain symbol uniqueness without namespace pollution. Local symbols, confined to a single translation unit, often receive minimal or no additional encoding beyond internal identifiers, while global symbols with external linkage incorporate full qualification, such as namespace or class prefixes (e.g., "N" to open a nested name followed by its scope components). This differentiation prevents collisions between symbols sharing names but differing in linkage: internal linkage omits exportable details to avoid unnecessary exposure, whereas external linkage embeds comprehensive context to resolve references across files. By embedding scope hierarchies (e.g., using "E" to terminate nested-name sequences), mangling preserves the visibility semantics of the source language, ensuring the linker can correctly bind calls and accesses.

The process presupposes foundational compilation stages, including lexical analysis to tokenize identifiers and parsing to build abstract syntax trees that resolve scopes and types. It draws on the symbol table to derive signatures for overloaded entities, capturing parameter types, return values, and qualifiers essential for disambiguation. Mangling occurs after parsing, typically during code generation or emission, once semantic checks have confirmed validity.

In the general workflow, the compiler starts with a source identifier (e.g., a function name), extracts its signature and contextual information (e.g., enclosing scopes, template arguments), applies ABI-specific encoding rules (often a grammar-based transformation yielding prefixed strings), and writes the mangled symbol to the object file's relocation and symbol tables for linker consumption. Substitutions may compress repetitive elements, like repeated types, to reduce length while preserving information.

Versioning poses significant pitfalls, as compiler updates can introduce ABI breaks by altering mangling schemes, such as changes to type encodings or standard compliance, leading to incompatible object files that fail to link or execute correctly. For example, evolving C++ standards may require revisions to the mangling of newer constructs or to RTTI handling, requiring explicit ABI versioning (e.g., via flags like -fabi-version in GCC) to maintain backward compatibility; mismatches between library and compiler ABIs often result in unresolved symbols or crashes. Platforms mitigate this through documented policies, but inadvertent breaks from untracked changes underscore the need for stable ABI contracts in production environments.

Techniques

Encoding Methods

Name mangling employs various encoding strategies to transform original identifiers into unique symbols that incorporate additional metadata, such as calling conventions or type signatures, ensuring linker resolution without conflicts. One common technique is prefix and suffix decoration, which appends or prepends concise indicators to the base name. For instance, under the __stdcall calling convention on 32-bit x86, the compiler prefixes an underscore to the function name and suffixes it with an @ followed by the total byte size of the parameters, resulting in a decorated name like _foo@8 for a function foo with two 4-byte integer parameters. This approach provides basic disambiguation for calling conventions while remaining compact.

A more comprehensive method is full signature encoding, where the mangled name embeds the function's scope, name, and parameter types, and in some schemes its return type and calling convention as well. The Itanium C++ ABI exemplifies this by prefixing all mangled names with _Z, followed by the encoded name (prefixed by its length in decimal digits), and then type codes for the parameters. For example, a function void foo(int) mangles to _Z3fooi, where 3 denotes the length of "foo" and i codes for the int parameter. Return types are not encoded for non-template functions, because C++ does not support overloading solely on return type, so the void return is implicit. MSVC, by contrast, does include the return type (e.g., the code H for int) and places it before the parameter encodings.

Encoding components typically include single-character type codes (such as i for int, d for double, and v for void), name lengths as digits, and qualifiers for namespaces using nested structures, with N opening a nested name and E closing it (e.g., _ZN1A1BE for A::B).

Standards like the Itanium ABI prioritize detailed encoding to support features such as templates and namespaces, ensuring One Definition Rule (ODR) compliance across compilations. In contrast, Microsoft Visual C++ (MSVC) uses a question-mark prefixed format for C++ functions, such as ?foo@@YAHH@Z for int foo(int), which encodes that foo is a non-member function (Y) using the __cdecl calling convention (A), the int return type (H), and the int parameter (H), with @Z terminating the parameter list. These ABIs balance informativeness against bloat; full signatures enable precise overload resolution but produce longer names, while undecorated exports in MSVC allow simpler linkage for C-compatible interfaces at the cost of reduced type safety.

Variations in mangling range from simple decorations, suitable for languages without overloading, to complex schemes that handle templates and exceptions. Vendor extensions may introduce custom codes for architecture-specific features, and non-ASCII characters are typically encoded using escape sequences to maintain portability. A basic decoration function might prepend an underscore to the base name and append a suffix derived from the parameters, as illustrated in the following sketch:
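```python
def mangle_name(base_name, param_types):
    """Decorate a name in the style of __stdcall: _name@<parameter bytes>."""
    # Parameter sizes in bytes, assuming a 32-bit target; add other types as needed.
    type_sizes = {"int": 4}
    byte_size = 0
    for param_type in param_types:
        byte_size += type_sizes[param_type]
    # The suffix is the total size of all parameters, e.g. _foo@8 for foo(int, int).
    return "_" + base_name + "@" + str(byte_size)
```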
This simplistic approach, akin to __stdcall decoration, contrasts with full-signature methods by omitting detailed type sequences for brevity.
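For comparison, full-signature encoding in the Itanium style can be sketched in a few lines. This is a minimal illustration, not a complete mangler: the names TYPE_CODES and mangle_itanium are hypothetical, only a handful of builtin types are covered, and templates, substitutions, member functions, and the special abbreviation for namespace std are all ignored.

```python
# Builtin type codes from the Itanium C++ ABI (a small subset).
TYPE_CODES = {
    "void": "v", "bool": "b", "char": "c", "int": "i",
    "long": "l", "float": "f", "double": "d",
}


def mangle_itanium(qualified_name, param_types):
    """Mangle a non-template free function such as 'Space::foo'."""
    parts = qualified_name.split("::")
    # Each identifier is written as <length><identifier>.
    encoded = "".join(f"{len(part)}{part}" for part in parts)
    if len(parts) > 1:
        # Qualified names are wrapped in N ... E.
        encoded = "N" + encoded + "E"
    # An empty parameter list is encoded as a single 'v'.
    params = "".join(TYPE_CODES[t] for t in param_types) or "v"
    return "_Z" + encoded + params


print(mangle_itanium("foo", ["int"]))         # _Z3fooi
print(mangle_itanium("add", ["int", "int"]))  # _Z3addii
print(mangle_itanium("Space::foo", ["int"]))  # _ZN5Space3fooEi
```

Unlike the decoration sketch, every parameter type contributes its own code here, so foo(int) and foo(double) yield different symbols.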

Demangling Processes

Demangling is the process of parsing mangled symbol names back into their original, human-readable source forms, guided by the rules defined in an application binary interface (ABI). This reversal relies on the deterministic encoding schemes outlined in the ABI to reconstruct function names, parameter types, namespaces, and other qualifiers, assuming the mangled string is complete and adheres to the expected format.

General methods for demangling involve rule-based parsing that follows the ABI's grammar, often implemented as recursive parsers or state machines to decode nested structures like templates and scopes. Type codes and abbreviations are typically resolved through table lookups, where short symbols (e.g., 'i' for int) map to full type names, while lengths and qualifiers are extracted via numeric prefixes or escape sequences. These approaches keep decoding efficient, with implementations like the GNU __cxa_demangle function providing a C-callable interface for programmatic or tool-based use.

Challenges in demangling include handling ambiguous encodings where multiple source constructs could map to similar mangled forms, leading to potential non-unique reconstructions, as seen in older schemes that required additional heuristics to disambiguate. Compiler-specific quirks arise because name mangling is not standardized across ABIs, resulting in incompatible formats between vendors like GCC and MSVC, which complicates cross-compiler tooling. Partial demangling is common for incomplete signatures, such as when only a function name is available without full type information, yielding approximate rather than exact output.

Demangling is essential in debuggers like GDB, where commands such as demangle or settings like set print demangle on convert symbols in stack traces and breakpoints to readable forms for easier navigation of overloaded functions. It also supports stack trace generation in error reporting and symbol resolution in profilers, where demangled names in symbol tables aid performance analysis by revealing call hierarchies and type details.

A representative example is demangling the Itanium ABI symbol _Z3fooi, which follows these steps based on the ABI rules: (1) recognize the _Z prefix indicating a mangled name; (2) parse the digit 3 as the length of the base name, followed by foo; (3) decode the trailing i via type table lookup as int, yielding foo(int). This breakdown highlights the sequential, prefix-driven nature of ABI-compliant demangling.
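The three steps for _Z3fooi can be sketched as a tiny parser. This is a deliberately limited illustration: the names TYPE_NAMES and demangle_simple are hypothetical, and only unqualified function names followed by builtin type codes are handled (no namespaces, templates, or substitutions).

```python
# Builtin type codes from the Itanium C++ ABI (a small subset).
TYPE_NAMES = {
    "v": "void", "b": "bool", "c": "char", "i": "int",
    "l": "long", "f": "float", "d": "double",
}


def demangle_simple(symbol):
    """Demangle names of the form _Z<len><name><type codes>, e.g. _Z3fooi."""
    # Step 1: check the _Z prefix.
    if not symbol.startswith("_Z"):
        raise ValueError("not an Itanium-style mangled name")
    pos = 2
    # Step 2: read the decimal length, then that many characters of name.
    digits_start = pos
    while pos < len(symbol) and symbol[pos].isdigit():
        pos += 1
    if pos == digits_start:
        raise ValueError("expected a length-prefixed name")
    length = int(symbol[digits_start:pos])
    name = symbol[pos:pos + length]
    pos += length
    # Step 3: map each remaining type code through the lookup table.
    params = [TYPE_NAMES[code] for code in symbol[pos:]]
    if params == ["void"]:
        params = []  # a lone 'v' means an empty parameter list
    return f"{name}({', '.join(params)})"


print(demangle_simple("_Z3fooi"))   # foo(int)
print(demangle_simple("_Z3addii"))  # add(int, int)
```

Production demanglers such as c++filt implement the full grammar; this sketch raises an error on anything outside its narrow subset rather than guessing.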

Implementations by Language

In C

In C, name mangling is not employed because the language lacks function overloading, namespaces, and other features that would require encoding additional information into symbol names for uniqueness. Instead, C relies on simple symbol decoration for external linkage, which varies by platform and compiler but does not include type or parameter encoding. This approach ensures straightforward compatibility with linkers and assembly code designed for C's flat namespace model.

External symbols in C object files retain their source names, often with minimal decoration such as a leading underscore on certain platforms to avoid conflicts with system-level identifiers. For example, GCC and Clang on x86-based macOS prepend an underscore, transforming a function foo into _foo in the symbol table, while on x86-64 Linux the symbol remains foo without prefix or alteration. This decoration is controlled in GCC via the -fleading-underscore option, which forces the prefix on targets where it is not the default, aiding interoperability with legacy code but potentially affecting binary compatibility.

When linking C code with C++, compatibility issues arise because C++ compilers mangle symbols to encode type information. To resolve this, C++ provides the extern "C" linkage specification, which suppresses mangling on functions intended for C interoperation so that the symbols match C's undecorated names. Without it, C++ code cannot link directly to C symbols, because the mangled names the C++ compiler looks for would not match the plain names the C compiler emitted.

C's flat namespace, where global symbols must be unique without qualification, simplifies linking but exposes limitations in large-scale projects, such as potential name collisions when integrating multiple libraries or modules. This motivates the adoption of mangling in language extensions or successors like C++ to support overloading and scoped identifiers without clashes.

In C++

C++ employs a sophisticated name mangling scheme to encode rich semantic information about functions, classes, templates, namespaces, and run-time type information (RTTI) into unique symbols, ensuring compatibility with C-style linkers while supporting language features like overloading and templates. The predominant scheme follows the Itanium C++ ABI, used by compilers such as GCC and Clang, where mangled names begin with _Z followed by an encoding that includes the function name length, the name itself, and parameter types. For instance, a function void foo(int) is mangled as _Z3fooi, where 3 indicates the length of "foo" and i encodes the int type. This encoding extends to scoped names, such as namespace Space { void foo(int); }, which becomes _ZN5Space3fooEi, wrapping the qualified name in N and E delimiters.

Template instantiations further illustrate the depth of encoding, as seen in std::vector<int>::push_back(const int&), mangled by GCC as _ZNSt6vectorIiSaIiEE9push_backERKi, where St denotes the std namespace, I...E encloses the template arguments (i for int and SaIiE for the default std::allocator<int>), 9 is the length of "push_back", the following E closes the nested name, and RKi represents a reference to const int. RTTI symbols, like those for type_info objects, also use this mangling to uniquely identify types across compilation units.

Compiler implementations vary significantly, leading to compatibility challenges. GCC and LLVM/Clang adhere to the Itanium ABI, producing names like _Z3fooi, while Microsoft Visual C++ (MSVC) uses a distinct decoration scheme starting with ?, such as ?foo@@YAHH@Z for int foo(int), where Y marks a non-member function, A the __cdecl calling convention, and each H an int (first the return type, then the parameter). These differences prevent direct binary linking between Itanium-based and MSVC-compiled objects without wrappers or tools, often requiring separate builds or interoperability layers in cross-platform projects.

To facilitate integration with C code or libraries, C++ provides extern "C" linkage, which suppresses mangling entirely, preserving plain names like foo for functions declared within such blocks; this is essential for mixed-language projects where C expects unmangled symbols. The C++ standards do not mandate a specific mangling scheme, leaving it to platform ABIs like the Itanium ABI (for Unix-like systems using GCC or Clang) and the MSVC ABI (for Windows), which ensures portability at the source level but requires ABI-specific handling for binaries. In practice, these mangled names contribute to larger symbol tables in object files and executables, increasing binary sizes by encoding detailed type information (though optimizations like substitutions mitigate redundancy), and their verbosity poses debugging challenges, often necessitating demangling for readable stack traces.

Demangling tools restore human-readable forms from these encodings. In GCC and Clang ecosystems, the c++filt utility processes mangled names from the command line, such as c++filt _Z3fooi printing foo(int), and supports options like --strip-underscore for handling leading underscores. Additionally, the runtime function __cxa_demangle from <cxxabi.h> demangles strings programmatically, returning a heap-allocated buffer containing the readable name, as used in debuggers and custom tools. MSVC provides undname.exe for similar command-line demangling, converting ?foo@@YAHH@Z back to int __cdecl foo(int).

In Java

In Java, name mangling is primarily employed to encode nested class structures into unique identifiers for the Java Virtual Machine (JVM), ensuring proper handling of class loading, reflection, and bytecode interpretation. For inner classes, the compiler generates binary names by inserting a dollar sign ($) between the outer class name and the inner class name, such as Outer$Inner for a named inner class. Anonymous classes follow a similar convention but use a numeric suffix, resulting in names like Outer$1. These mangled names are stored in the constant pool of class files as CONSTANT_Utf8_info entries, preserving the $ separator directly without further alteration beyond the internal binary name format, where package components are separated by forward slashes (/). The resulting .class files adopt these names, e.g., Outer$Inner.class, to reflect the hierarchical relationship while maintaining filesystem compatibility.

This mangling supports nested scopes in Java by preventing name clashes and enabling the JVM to resolve dependencies accurately during loading and linking. In bytecode, these encoded names appear in the InnerClasses attribute (§4.7.6 of the JVM specification), which records the inner class's binary name, outer class reference, simple name (if any), and access flags, allowing reflection operations like Class.getDeclaredClasses() to retrieve nested members without ambiguity. Without such encoding, the JVM, which has no native notion of nesting, could not distinguish between similarly named classes in different nesting contexts, potentially leading to incorrect resolution in dynamic environments.

For interfacing with native code via the Java Native Interface (JNI), Java applies additional mangling to native method names to create valid C/C++ identifiers. Dots (.) in the fully qualified binary class name are replaced with underscores (_), and the $ in nested class names is escaped as the Unicode sequence _00024 to avoid conflicts with C symbol conventions. The mangled form prefixes "Java_" to the class name, appends an underscore, and then the method name; for overloaded native methods only, two underscores and the mangled argument signature are appended as well. Thus, a native method print in class pkg.HelloWorld becomes Java_pkg_HelloWorld_print, while the same method in the nested class Outer$Inner mangles to Java_Outer_00024Inner_print. These rules give each native declaration a predictable symbol, so the dynamic linker can resolve it at run time without overloading ambiguities.

Examples illustrate this in practice: for a nested class Outer.Inner, the class file is Outer$Inner.class, and reflection via Outer.class.getDeclaredClasses() yields an array including the class whose binary name is Outer$Inner. In JNI, a static native method void print(int) in class HelloWorld of package com.example corresponds to the C function Java_com_example_HelloWorld_print(JNIEnv*, jclass, jint); the descriptor (I)V does not appear in the symbol because the method is not overloaded. Tools like javah (deprecated) or javac -h produce headers with these exact mangled prototypes.

With the introduction of lambda expressions in Java 8, name mangling evolved to handle synthetic methods generated by the compiler. Lambda bodies are translated into private synthetic methods with generated names, typically prefixed by lambda$, followed by the enclosing method name and a unique number, such as lambda$main$0, to encapsulate the lambda body while avoiding conflicts with user-defined methods. These synthetic methods are linked through invokedynamic bytecode instructions, and their generated names ensure uniqueness within the class, supporting tasks such as debugging without altering the JVM's core linkage model. This approach, introduced with JEP 126, desugars lambda bodies into ordinary methods rather than compiling each lambda to an anonymous inner class. Subsequent versions, such as Java 11, refined synthetic name stability for better tool support, but the core mangling pattern persists for lambda and method reference implementations.
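The JNI escaping rules can be sketched as follows. This is an illustrative helper rather than part of any JDK tool: the function names jni_escape and jni_symbol are hypothetical, only the short (non-overloaded) symbol form is produced, and characters outside the Basic Multilingual Plane are not handled.

```python
def jni_escape(component):
    """Escape one name according to the JNI symbol rules."""
    out = []
    for ch in component:
        if ch.isascii() and ch.isalnum():
            out.append(ch)
        elif ch in "./":
            out.append("_")    # package separator
        elif ch == "_":
            out.append("_1")   # literal underscore
        elif ch == ";":
            out.append("_2")
        elif ch == "[":
            out.append("_3")
        else:
            # Any other character becomes _0xxxx (lowercase hex code unit).
            out.append(f"_0{ord(ch):04x}")
    return "".join(out)


def jni_symbol(binary_class_name, method_name):
    """Build the short JNI symbol for a non-overloaded native method."""
    return f"Java_{jni_escape(binary_class_name)}_{jni_escape(method_name)}"


print(jni_symbol("pkg.HelloWorld", "print"))  # Java_pkg_HelloWorld_print
print(jni_symbol("Outer$Inner", "print"))     # Java_Outer_00024Inner_print
```

Because a literal underscore is itself escaped as _1, a method named my_method in class Foo maps to Java_Foo_my_1method, which keeps the encoding unambiguous.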

In Python

Python employs a lightweight form of name mangling to simulate private attributes in classes, primarily to prevent accidental name clashes during inheritance rather than to enforce strict access control. Identifiers in class definitions that begin with two or more underscore characters (but do not end with two or more) are considered private names and undergo mangling: the class name, with leading underscores stripped and a single underscore added in front, is prefixed to the identifier, transforming, for instance, __foo in MyClass to _MyClass__foo. This convention signals developer intent for internal use, aligning with Python's philosophy of trusting programmers while providing tools to avoid common pitfalls in subclassing.

The mangling process occurs at compile time, when source code is transformed into bytecode, so the altered names are embedded in the compiled output and resolve consistently across Python implementations. Within a class body, references to mangled names automatically resolve to the transformed version, maintaining seamless internal access; for example, a method calling self.__method() invokes the mangled _ClassName__method without explicit developer intervention. In inheritance scenarios, each class applies mangling based on its own name, so a subclass SubClass inheriting from BaseClass would mangle its own __bar to _SubClass__bar, distinct from _BaseClass__bar, thereby avoiding unintended overrides of superclass internals. Consider the following example:
```python
class Base:
    def __init__(self):
        self.__private = "base value"

    def __method(self):
        return "base method"

class Sub(Base):
    def __init__(self):
        super().__init__()
        self.__private = "sub value"  # Mangled to _Sub__private

    def access_base(self):
        return self._Base__method()  # Explicit access to base's mangled method
```
Here, Base.__private compiles to _Base__private, while Sub.__private becomes _Sub__private, allowing the subclass to define its own attribute without conflicting with the superclass's. To access a mangled name from outside the class, one must use the transformed form, such as obj._Base__private, demonstrating that mangling is a superficial barrier rather than robust encapsulation.

This mechanism has limitations: mangling is easily bypassed by reconstructing the transformed name, underscoring that it serves as a convention for readability and safety, not a security feature against deliberate access. Python does not support method overloading based on signatures, so mangling distinguishes names only by class and never encodes parameter types. Name mangling was introduced in the Python 1.x series during the 1990s and has remained consistent, with minor adjustments in Python 3.x to handle new syntax like async def methods, ensuring they mangle correctly alongside synchronous code.
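The effect can be checked at run time by inspecting the instance dictionary. The short script below is a self-contained illustration that redefines trimmed-down versions of the two classes and uses only standard built-ins.

```python
class Base:
    def __init__(self):
        self.__private = "base value"   # stored as _Base__private


class Sub(Base):
    def __init__(self):
        super().__init__()
        self.__private = "sub value"    # stored as _Sub__private


obj = Sub()
# Both attributes coexist on the same instance under different mangled names.
print(sorted(vars(obj)))          # ['_Base__private', '_Sub__private']
print(obj._Base__private)         # base value
# Outside a class body the string "__private" is not mangled, so no such
# attribute exists on the instance.
print(hasattr(obj, "__private"))  # False
```

Because the two assignments land on different keys, the subclass never overwrites the value set by the base class initializer.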

In Pascal

In Pascal dialects, name mangling supports overloaded procedures and functions by encoding parameter types into the symbol name, enabling the compiler to enforce type safety and distinguish overloads during linking. This approach contrasts with languages like C, where symbols remain unmangled by default, and is essential for Pascal's type-strict semantics.

Implementations such as Turbo Pascal and Delphi employ a compact mangling scheme, typically prefixing the function name with an underscore and appending abbreviated type codes or counts, such as _FOO$1 for a function foo with a single parameter. Units serve as namespaces, with the unit name incorporated to avoid conflicts across modules. For overloaded examples, foo(i: Integer) might be encoded as FOOI, while foo(s: String) becomes FOOS, ensuring uniqueness without verbose signatures.

Free Pascal adopts a similar but more extensible scheme, converting routine names to uppercase and prefixing them with the unit name, followed by underscores and dollar signs that introduce the parameters. For instance, foo(i: Integer; s: String) in unit example would mangle to a symbol of the form EXAMPLE_$$_FOO$<type>$<type>, with each parameter's type name (as resolved by the compiler) introduced by a dollar sign, fully encoding the signature for overload resolution. This verbose format aids cross-platform portability and supports advanced features like nested procedures, differing from Turbo Pascal's brevity by prioritizing explicitness.

These variations reflect evolving compiler designs: Turbo Pascal's compact style suited early DOS environments, while Delphi and Free Pascal emphasize compatibility and extensibility for modern development. Modifiers like alias or cdecl can disable mangling for external linkages, preserving plain names when needed.

In Fortran

In Fortran, name mangling became essential with the introduction of modules in the Fortran 90 standard, which enabled separate compilation of related procedures and data, necessitating unique names to avoid conflicts across compilation units. Modules encapsulate subroutines, functions, and variables, and their procedures are mangled to include the module name, ensuring isolation. For instance, a subroutine named sub within a module mod is typically transformed into a symbol that combines the module identifier with the procedure name.

Compiler implementations vary in their mangling schemes because the Fortran language specifications do not standardize a format. The GNU Fortran compiler (gfortran) uses lowercase names with a double underscore prefix and a _MOD_ separator: a subroutine mysub in module mymod becomes __mymod_MOD_mysub. In contrast, the Intel Fortran Compiler (ifort or ifx) uses _mp_ as the separator: on Linux the same subroutine becomes mymod_mp_mysub_, in lowercase with a trailing underscore, while on Windows names are uppercased by default, giving MYMOD_mp_MYSUB. These differences arise from historical conventions and can complicate linking across compilers, often requiring tools like nm to inspect symbols or explicit BIND attributes for resolution.

Fortran 2003 extended support for generics through generic interfaces and parameterized derived types, allowing procedures to handle multiple kinds (e.g., single or double precision via KIND parameters) without distinct names at the call site. Mangling applies to the specific procedures rather than the generic name itself; overload resolution occurs at compile time based on argument types and kinds, with each specific procedure retaining its module-prefixed symbol. For example, a generic interface covering two KIND variants resolves to separately named specific procedures, such as __mymod_MOD_specific_real4 and __mymod_MOD_specific_real8 in gfortran, ensuring unique linkage while hiding the distinction in user code.

The Fortran 2018 standard further enhanced coarray features for parallel programming, building on Fortran 2008's introduction of coarrays, by adding teams and additional collective and atomic operations, which may require additional mangling considerations for distributed symbols in multi-image executions. However, core mangling for coarray procedures follows module conventions, with no explicit coarray encoding in symbols unless specified via directives.

Fortran's case insensitivity, inherited from earlier standards, influences mangling by converting identifiers to a consistent case (lowercase for gfortran, uppercase for some other compilers such as Intel Fortran on Windows) to prevent ambiguity. For interoperability with C, the ISO_C_BINDING module is used together with the BIND(C) attribute, which exports symbols without Fortran-specific decoration; for example, BIND(C, NAME='sub') preserves the exact name sub for C linkage, bypassing module prefixes. This facilitates mixed-language projects but requires careful specification to align with C's case-sensitive naming.
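The gfortran convention described above is simple enough to express as a one-line helper, which can be handy when searching nm output for a particular module procedure. The function name gfortran_module_symbol is hypothetical, and the sketch covers only procedures contained directly in a module, without BIND(C).

```python
def gfortran_module_symbol(module, procedure):
    """Symbol gfortran emits for a procedure contained in a module."""
    # Fortran is case-insensitive; gfortran folds identifiers to lowercase.
    return f"__{module.lower()}_MOD_{procedure.lower()}"


print(gfortran_module_symbol("mymod", "mysub"))           # __mymod_MOD_mysub
print(gfortran_module_symbol("MyMod", "Specific_Real8"))  # __mymod_MOD_specific_real8
```

Source spellings that differ only in case map to the same symbol, matching Fortran's case-insensitive name rules.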

In Rust

Rust employs a name mangling scheme in its compiler, rustc, to generate unique symbols for functions, methods, and other items, ensuring compatibility with the backend and avoiding linker conflicts. The v0 mangling format, introduced via RFC 2603 and selectable with the -C symbol-mangling-version=v0 compiler flag, is inspired by the Itanium C++ ABI but adapted with Rust-specific extensions for encoding language features like generics and traits. This scheme uses a reversible encoding with an _R prefix to distinguish Rust symbols, employing Punycode for identifiers that contain non-ASCII characters and base-62 numbering for elements such as disambiguators and back references.

Key features of Rust's mangling include support for generics, where a path and its generic arguments are enclosed in I...E tags; trait implementations, encoded with an X tag (inherent implementations use M); lifetimes, represented with L followed by a base-62 index; and trait object types, introduced by D, which carry any associated type bindings. For instance, the instantiation align_of::<usize> of the generic function fn align_of<T>() in std::mem mangles to _RINvNtC3std3mem8align_ofjE, where j encodes the type usize (f64 would be d) and I...E wraps the path and its generic argument. Trait methods, such as <Foo<u32> as Bar<u64>>::foo in a crate named mycrate, are mangled along the lines of _RNvXINtC7mycrate3FoomEINtC7mycrate3BaryE3foo, incorporating the implementing type and the trait path with their generic arguments (m for u32 and y for u64).

Rust's application binary interface (ABI) for internal Rust-to-Rust calls, including mangled symbols, is not guaranteed stable across versions, requiring recompilation of dependencies when updating rustc. Functions declared extern "C" use the platform's C calling convention, which has been available since Rust 1.0 in 2015, but they are still mangled by default; the #[no_mangle] attribute is what produces a C-compatible symbol name. The Rust 2021 edition introduced no major changes to the mangling scheme itself. Demangling of symbols can be performed using external tools, such as the rustc-demangle crate for programmatic access or the rustfilt command-line utility, which reverses the encoding to produce human-readable names for debugging and backtraces.

For FFI exports, developers apply #[no_mangle] pub extern "C" fn to bypass mangling entirely, ensuring the symbol matches the declared name.
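To make the structure of a v0 symbol more concrete, the following Python sketch decodes a small subset of the format: crate roots (C), nested paths (N), and generic argument lists (I...E) whose arguments are basic types. It is a hypothetical teaching aid under those assumptions and not a substitute for rustc-demangle; impl paths, back references, Punycode identifiers, and every other construct are rejected or unsupported.

```python
# Single-letter codes for basic types in the v0 format.
BASIC_TYPES = {
    "a": "i8", "b": "bool", "c": "char", "d": "f64", "e": "str",
    "f": "f32", "h": "u8", "i": "isize", "j": "usize", "l": "i32",
    "m": "u32", "n": "i128", "o": "u128", "s": "i16", "t": "u16",
    "u": "()", "x": "i64", "y": "u64", "z": "!",
}


def read_identifier(sym: str, pos: int):
    """Read an optional disambiguator and a length-prefixed identifier."""
    if sym[pos] == "s":                 # disambiguator: 's' <base-62> '_'
        pos = sym.index("_", pos) + 1
    if sym[pos] == "u":
        raise ValueError("Punycode identifiers are not handled in this sketch")
    start = pos
    while sym[pos].isdigit():
        pos += 1
    length = int(sym[start:pos])
    if sym[pos] == "_":                 # separator before a leading digit or '_'
        pos += 1
    return sym[pos:pos + length], pos + length


def read_path(sym: str, pos: int):
    """Decode a path and return (list of components, next position)."""
    tag = sym[pos]
    if tag == "C":                      # crate root
        name, pos = read_identifier(sym, pos + 1)
        return [name], pos
    if tag == "N":                      # nested path; sym[pos + 1] is the namespace
        parts, pos = read_path(sym, pos + 2)
        name, pos = read_identifier(sym, pos)
        return parts + [name], pos
    if tag == "I":                      # path with generic arguments, closed by 'E'
        parts, pos = read_path(sym, pos + 1)
        args = []
        while sym[pos] != "E":
            args.append(BASIC_TYPES[sym[pos]])
            pos += 1
        parts[-1] += "::<" + ", ".join(args) + ">"
        return parts, pos + 1
    raise ValueError(f"unsupported tag {tag!r}")


def demangle_simple(symbol: str) -> str:
    if not symbol.startswith("_R"):
        raise ValueError("not a v0 symbol")
    parts, _ = read_path(symbol, 2)
    return "::".join(parts)


print(demangle_simple("_RNvNtC3std3mem8align_of"))
print(demangle_simple("_RINvNtC3std3mem8align_ofjE"))
```

The second call decodes the trailing j as usize, which is how the generic argument of the instantiation is recovered from the symbol.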

In Objective-C

In Objective-C, name mangling primarily supports the dynamic runtime dispatch system by encoding symbols for classes, methods, categories, protocols, and other runtime entities in a way that ensures uniqueness in the object file's symbol table. This scheme is essential for the Objective-C runtime to locate and invoke methods at runtime, as the language relies on message passing rather than static linking for method calls. The mangling rules are defined by the compiler's implementation of the Objective-C ABI rather than by the language; the forms described here follow the GCC convention, and exact spellings differ between runtimes (Apple's modern runtime, for example, names class symbols _OBJC_CLASS_$_<classname>).

Method implementations are mangled based on whether they are instance or class methods, incorporating the class name, any category name, and the selector. For instance methods, the prefix is _i_, followed by the class name, the category name (if applicable), and the selector, with each colon in the selector replaced by an underscore; a selector that ends in a colon therefore yields a name that ends in an underscore. Class methods use _c_ instead of _i_. For example, an instance method declared as - (void)bar:(int)i in the Foo class without a category mangles to _i_Foo_bar_. Similarly, a class method + (void)methodWith:(id)arg1 arg2:(id)arg2 in a class named Class has the selector methodWith:arg2: and mangles to _c_Class_methodWith_arg2_; parameter names such as arg1 are not part of the selector and do not appear in the symbol. This encoding allows the runtime to resolve the implementation via the class and selector during dispatch.

Class symbols are mangled as _OBJC_CLASS_<classname>, pointing to the class structure in the runtime. For instance, the NSAutoreleasePool class, a core Foundation framework component used for managing autoreleased objects, appears as _OBJC_CLASS_NSAutoreleasePool. Categories extend classes and are handled by including the category name in method mangling (e.g., _i_Foo_MyCategory_bar_ for an instance method in the MyCategory category on Foo) and by dedicated symbols like _OBJC_CATEGORY_<classname>_<category> for the category data. Protocols, which define interfaces without implementations, use _OBJC_PROTOCOL_<protocolname> for their runtime structures, enabling conformance checks and optional method resolution. These conventions ensure that dynamic features like categories and protocols do not cause conflicts in the linker's flat namespace.

Compilers like GCC and Clang apply this mangling alongside other schemes to support mixed-language code, such as Objective-C++, where C++ entities use Itanium-style mangling while Objective-C symbols follow the above rules; this allows seamless linking of C++, Objective-C, and C code in iOS and macOS applications. On Apple platforms, these symbols appear in Mach-O binaries and are processed by the dyld dynamic linker, with specifics varying between the legacy 32-bit runtime and the modern 64-bit runtime introduced in Mac OS X 10.5 and mandatory on iOS.

Objective-C mangling has remained stable over time. The introduction of blocks in Mac OS X 10.6 (2009) and iOS 4 (2010) added compiler-generated __Block_byref structures, named struct __Block_byref_<varname>_<index>, for __block variables captured in block literals, enabling copy and strong/weak capture semantics. Automatic Reference Counting (ARC), adopted in 2011, did not alter core method or class mangling but integrated blocks more tightly with retain/release operations via compiler-generated helpers, maintaining compatibility with existing code.
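The method-naming rule described in this section is simple enough to express as a short Python function. The sketch below follows the scheme as stated here (prefix, class name, optional category, selector with colons turned into underscores); the function name is hypothetical, and real compilers and runtimes may differ in detail.

```python
def mangle_objc_method(class_name: str, selector: str,
                       is_class_method: bool = False,
                       category: str = "") -> str:
    """Build a method symbol following the _i_/_c_ convention described above."""
    prefix = "_c_" if is_class_method else "_i_"
    parts = [class_name]
    if category:
        parts.append(category)
    # Each colon in the selector becomes an underscore.
    parts.append(selector.replace(":", "_"))
    return prefix + "_".join(parts)


print(mangle_objc_method("Foo", "bar:"))
print(mangle_objc_method("Class", "methodWith:arg2:", is_class_method=True))
print(mangle_objc_method("Foo", "bar:", category="MyCategory"))
print(mangle_objc_method("Point", "value"))
```

One consequence of replacing colons with underscores is that the encoding is not always reversible: under this sketch the selectors set_value: and set:value: both produce _i_Foo_set_value_.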

In Swift

Swift's name mangling scheme encodes symbols in a type-safe manner to ensure uniqueness across overloads, generics, and other language features, producing compact strings that begin with a prefix such as $s for stable ABI symbols. This encoding uses a postfix notation with single-character or multi-character operators to represent entities like modules, functions, types, and contexts; for instance, a function named func in a module named module that takes an Int and returns an Int might be mangled as something like $s6module4funcyS2iF, where 6module encodes the module name with its length, 4func the function name, y indicates that there are no argument labels, S2i abbreviates two consecutive Si entries (Si represents the Int type) for the result and parameter types, and the trailing F marks a function. The scheme handles complex types explicitly: optionals are encoded with the Sg suffix (e.g., Int? as SiSg), tuples list their element types followed by a t operator (e.g., (Int, String) as Si_SSt), and bound generic types such as arrays wrap their arguments between y and G (e.g., Array<Int> as SaySiG).

A concrete example illustrates this for a simple function: func add(_ a: Int, _ b: Int) -> Int in a module named main mangles to something like $s4main3addyS2i_SitF, where 4main is the module, 3add the base name, y the marker for the absence of argument labels, S2i_Sit the result type followed by the tuple of parameter types (S introduces a standard library type and i selects Int), and F the function marker. Generic specializations receive unique manglings to distinguish instantiations, and substitutions (such as A followed by an index) stand in for repeated elements, so that each specialized form has a distinct symbol without redundancy. This approach supports ABI stability, which was achieved in Swift 5, released in 2019, allowing binary compatibility across Swift versions on Apple platforms by fixing the mangling rules and incorporating the standard libraries into the OS. Prior to this, manglings used unstable prefixes such as _T in Swift 1.0 (2014) and later _T0 and $S in Swift 4.x, changing as the scheme evolved to support library evolution and resilience.

For demangling, the swift-demangle command-line tool, included in the Swift toolchain, reverses these encodings to produce human-readable names, aiding debugging of symbols in binaries or stack traces. Interoperability with Objective-C relies on exposing @objc-marked Swift entities under ordinary Objective-C selectors, so Swift methods can be called from Objective-C without Swift-specific decoration in the selector. Similarly, for C interoperability, Swift provides C-compatible linkage via attributes like @_cdecl, which exposes a function under a plain, unmangled name, enabling direct calls from C code as if it were declared extern "C". Imported C types are placed in the __C module, abbreviated So, and mangled accordingly (e.g., a C struct CxxStruct as So9CxxStructV).
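The postfix style of the type encodings in this section can be illustrated with a small Python sketch. It covers only length-prefixed ASCII identifiers, a handful of standard library types, optionals, tuples, and arrays; the helper names are hypothetical, and the sketch ignores substitutions and repetition shorthand (such as S2i), so real compiler output can be shorter than what it produces.

```python
# Standard library abbreviations used by the Swift mangling scheme.
STANDARD_TYPES = {"Int": "Si", "String": "SS", "Bool": "Sb", "Double": "Sd"}


def encode_identifier(name: str) -> str:
    """Length-prefixed identifier (plain ASCII names only in this sketch)."""
    return f"{len(name)}{name}"


def encode_optional(wrapped: str) -> str:
    """Optional<T>: the wrapped type followed by the Sg operator."""
    return wrapped + "Sg"


def encode_array(element: str) -> str:
    """Array<T>: Sa, then the argument list between y and G."""
    return "Say" + element + "G"


def encode_tuple(*elements: str) -> str:
    """Tuple: first element, '_', remaining elements, then the t operator."""
    if len(elements) < 2:
        raise ValueError("this sketch only handles tuples of two or more elements")
    first, *rest = elements
    return first + "_" + "".join(rest) + "t"


int_type = STANDARD_TYPES["Int"]
string_type = STANDARD_TYPES["String"]

print(encode_identifier("module"))            # 6module
print(encode_optional(int_type))              # SiSg
print(encode_tuple(int_type, string_type))    # Si_SSt
print(encode_array(int_type))                 # SaySiG
```

Because each operator follows its operands, nested types compose by plain concatenation; for example, encode_array(encode_optional(int_type)) yields SaySiSgG for an array of optional integers.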
