Friday, August 6, 2010

Name Mangling in C++

When a C++ compiler compiles a program, it encodes all function names and certain other identifiers to include type and scoping information. This encoding process is called name mangling. The linker uses these mangled names to ensure type-safe linkage. The mangled names appear in the object files and in the final executable.

What's a symbol?

In every C++ program, library, or object file, each non-static function is represented in the binary as a symbol. These symbols are text strings that uniquely identify a function within the program, library, or object file.

The Need for Name Mangling:

C programs do not use name mangling, because in C no two non-static functions can have the same name. The symbol name is therefore the same as the function name: the symbol for myfunc is myfunc.
Because C++ allows overloading (different functions with the same name but different numbers or types of arguments) and has many features that C lacks, such as classes, member functions, and exception specifications, it is not possible to simply use the function name as the symbol name. To solve that, C++ uses name mangling, which encodes the function name and all the necessary information (such as its scope and the number and types of its arguments) into a special string whose format is specific to the compiler.
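To make the overloading problem concrete, here is a minimal sketch. The function names describe and c_style_add are hypothetical and chosen only for illustration. The two describe overloads share one source-level name, so the compiler has to emit two different symbols for them. The extern "C" linkage specification, which the text above does not cover, asks the compiler to use C-style naming for a function, so it typically appears in the symbol table under its plain name.

```cpp
#include <string>

// Two overloads with the same source-level name. Each needs its own
// symbol in the object file, so the compiler mangles the parameter
// types into the symbol name.
std::string describe(int)    { return "describe(int)"; }
std::string describe(double) { return "describe(double)"; }

// extern "C" requests C linkage: no C++ mangling is applied, which is
// also why a function declared this way cannot be overloaded.
extern "C" int c_style_add(int a, int b) { return a + b; }
```

Compiling this to an object file and running nm on it should show two distinct mangled entries for describe and an unmangled entry for c_style_add. The exact spelling of the mangled names depends on the compiler.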

e.g.,
bpte4500s001:/sunbuild1/giri/testcases/% nm hide.o

hide.o:

[Index]   Value      Size    Type  Bind  Other Shndx   Name
[3]     |        16|      56|FUNC |GLOB |3    |2      |__1cKCRectangleKset_values6Mii_v_
[4]     |         0|       0|NOTY |GLOB |0    |ABS    |__fsr_init_value
[1]     |         0|       0|FILE |LOCL |0    |ABS    |hide.cpp
[2]     |        88|      32|FUNC |GLOB |2    |2      |main
Here, "__1cKCRectangleKset_values6Mii_v_" is the mangled name.

However, mangled names like this are inconvenient for developers, because they are difficult to read and make debugging harder.

Two utilities ship with the Sun Studio C/C++ compiler collection to convert mangled names back to their original source code names:
1) c++filt
2) dem

c++filt is a filter that demangles (decodes) mangled names.
bpte4500s001% echo __1cKCRectangleKset_values6Mii_v_ | c++filt
void CRectangle::set_values(int,int)

"dem" is another utility that demangles C++ names.
bpte4500s001% dem __1cKCRectangleKset_values6Mii_v_
__1cKCRectangleKset_values6Mii_v_ == void CRectangle::set_values(int,int)
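Demangling can also be done from inside a program. The sketch below is not the Sun Studio scheme shown above: it assumes a compiler that follows the Itanium C++ ABI (for example GCC or Clang), where the header cxxabi.h provides abi::__cxa_demangle. The helper name demangle is hypothetical, and the mangled string used in the usage note is the Itanium-style encoding of the same member function, not the Sun one.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <cstdlib>    // std::free
#include <string>

// Returns the demangled form of an Itanium-ABI mangled name, or the
// input unchanged if the name cannot be demangled.
std::string demangle(const char* mangled) {
    int status = 0;
    // With a null output buffer, the function allocates the result
    // with malloc; the caller must release it with free.
    char* out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || out == nullptr) {
        return mangled;
    }
    std::string result(out);
    std::free(out);
    return result;
}
```

With GCC or Clang, demangle("_ZN10CRectangle10set_valuesEii") yields "CRectangle::set_values(int, int)". Passing the Sun Studio name from the example above would not work here, because the two compilers use different mangling schemes.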

Note:
The C++ standard does not define how names have to be mangled, so every compiler mangles names in its own way. Some compilers even change their name mangling algorithm between versions. This can be a problem if developers rely on how a particular compiler mangles C++ symbols, because code that depends on one scheme may not work with the next version of the compiler.
