Friday, August 6, 2010

Name Mangling in C++

When a C++ compiler compiles a program, it encodes all function names and certain other identifiers to include type and scoping information. This encoding process is called name mangling. The linker uses these mangled names to ensure type-safe linkage, and they appear in the object files and the final executable file.

What's a symbol?

In every C++ program, library or object file, all non-static functions are represented in the binary file as symbols. These symbols are text strings that uniquely identify a function in the program, library or object file.

The Need for Name Mangling:

C programs do not use name mangling, because in C no two non-static functions can have the same name. The symbol name is therefore the same as the function name: the symbol of myfunc will be myfunc.
Because C++ allows overloading (different functions with the same name but a different number or types of arguments) and has many features C does not, like classes, member functions and exception specifications, it is not possible to simply use the function name as the symbol name. To solve that, C++ uses name mangling, which encodes the function name and all the necessary information (like the enclosing class and the number and types of the arguments) into a special string whose format is chosen by the compiler.
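
As a rough illustration of why overloads need distinct symbols, here is a small sketch. The names area and c_area are made up for this example, and the mangled names in the comments are what GCC and Clang typically produce under the Itanium C++ ABI; other compilers, including the Sun Studio compiler used below, encode them differently.

```cpp
// Two overloads share the source-level name "area", so the compiler must
// give each of them a distinct symbol in the object file.
int area(int side)               // typically _Z4areai with GCC/Clang
{
    return side * side;
}

int area(int width, int height)  // typically _Z4areaii with GCC/Clang
{
    return width * height;
}

// extern "C" asks for C linkage, so the symbol is not mangled. For that
// reason at most one function named c_area can have C linkage.
extern "C" int c_area(int width, int height)
{
    return width * height;
}
```

Compiling this to an object file and running nm on it should list three separate function symbols, two of them mangled.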

For example:
bpte4500s001:/sunbuild1/giri/testcases/% nm hide.o

hide.o:

[Index]   Value      Size    Type  Bind  Other Shndx   Name
[3]     |        16|      56|FUNC |GLOB |3    |2      |__1cKCRectangleKset_values6Mii_v_
[4]     |         0|       0|NOTY |GLOB |0    |ABS    |__fsr_init_value
[1]     |         0|       0|FILE |LOCL |0    |ABS    |hide.cpp
[2]     |        88|      32|FUNC |GLOB |2    |2      |main
"__1cKCRectangleKset_values6Mii_v_" is the mangled name

But this kind of scheme is undesirable for developers, because the names are difficult to read and debug.

Two utilities are available with the Sun Studio C/C++ compiler collection to convert mangled names back to their original source code names:
1) c++filt
2) dem

c++filt is a filter that demangles (decodes) mangled names.
bpte4500s001% echo __1cKCRectangleKset_values6Mii_v_ | c++filt
void CRectangle::set_values(int,int)

"dem" is another utility to demangle C++ names
bpte4500s001% dem __1cKCRectangleKset_values6Mii_v_
__1cKCRectangleKset_values6Mii_v_ == void CRectangle::set_values(int,int)
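
Demangling can also be done from inside a program. The following is a minimal sketch that assumes GCC or Clang, whose runtime provides abi::__cxa_demangle in <cxxabi.h>; the helper name demangle is hypothetical. It understands Itanium ABI names only, so it will not decode the Sun Studio names shown above.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang specific)
#include <cstdlib>    // std::free
#include <string>

// Returns the demangled form of an Itanium-ABI mangled name, or the input
// unchanged if it cannot be demangled.
std::string demangle(const char* mangled)
{
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result with
    // malloc, so the caller has to free it.
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return mangled;
    }
    std::string result(text);
    std::free(text);
    return result;
}
```

For instance, demangle("_ZN10CRectangle10set_valuesEii") should give CRectangle::set_values(int, int), the Itanium counterpart of the Sun Studio name used above.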

Note:
The C++ standard does not define how names have to be mangled; thus every compiler mangles names in its own way. Some compilers even change their name mangling algorithm between versions. This can be a problem if developers rely on how a particular compiler mangles C++ symbols, as the same scheme may not work with the next version of the compiler.

Thursday, August 5, 2010

Interview questions on C/C++

1. What is the size of the structure? Assume int is 4 bytes and float is 4 bytes.

struct {
    enum  {
        integer,
        floating
    } type;
    union {
        int a;
        float b;
    };
} t1;

Answer: The enum takes the space of one integer (4 bytes), and the union takes the space of its largest member (4 bytes), so the total is 8 bytes.
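
The 8-byte figure can be checked at compile time. This is only a sketch: the tag names Tagged and Kind are hypothetical (added so the type can be referred to), and the assertions assume a platform where int, float and the enum are all 4 bytes, which is common but not guaranteed.

```cpp
// Same layout as in the question, with names added for the struct and enum.
struct Tagged {
    enum Kind { integer, floating } type;  // enum member, usually int-sized
    union {                                // anonymous union: a and b overlap
        int a;
        float b;
    };
};

static_assert(sizeof(int) == 4 && sizeof(float) == 4,
              "this sketch assumes 4-byte int and float");
static_assert(sizeof(Tagged) == 8,
              "4 bytes for the enum plus 4 bytes for the union");
```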

Suppose the structure declaration is changed as below; then what would be the size?

struct {
    enum type {
        integer,
        floating
    } ;
    union {
        int a;
        float b;
    };
} t1;

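Answer: The struct no longer has a member named type, because "enum type { ... };" only declares a type. In C++ this is a valid nested type declaration that adds no storage, so only the union remains and the size is 4 bytes. In C, a declaration inside a struct that declares no member is not valid, so a C compiler has to issue a diagnostic (GCC, for example, warns that the declaration does not declare anything).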

2. What does the following function do? Can it be given a better name?

DT* func(DT* a, const DT* b)
{
    DT *c = a;
    while(*a++ = *b++);
    return c;
}

Answer: Assuming that DT is an integer or char type, the function copies data from b to a until a zero is found in b. The zero is also copied, so it can be renamed as a string copy function (it behaves like strcpy).
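
One way to try the function out is to write it as a template, so that DT can be char, int or any other type with a zero value. The name copy_until_zero is only a suggestion and not part of the original question.

```cpp
// The function from the question as a template. Like strcpy, it assumes
// that dst has room for everything up to and including the zero, and that
// the two ranges do not overlap.
template <typename DT>
DT* copy_until_zero(DT* dst, const DT* src)
{
    DT* start = dst;
    // Copy one element, then stop once the element just copied was zero.
    while ((*dst++ = *src++) != 0) {
    }
    return start;
}
```

Called with a char buffer and a string literal it acts as a string copy; called with int arrays it copies up to and including the first 0 element and leaves the rest of the destination untouched.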

3. Is the output of lines one and two inside the loop the same, or different?

int a[100][100];
for(int i = 0; i<100; i++) {
    for(int j = 0; j<100; j++) {
        printf("%d", a[i][j]);
        printf("%d", a[j][i]);
    }
}

Answer: The output is different, because the two lines refer to different rows and columns (except when i equals j), but by the end each of them has printed all the data in the two-dimensional array.

If one needs to be preferred over the other, which one would you prefer, and why?

Answer: The first one will be faster on a system with a cache. C stores arrays in row-major order, so a[i][j] walks through consecutive memory locations, while a[j][i] jumps a whole row (100 ints) on every step and therefore causes more cache misses. In a practical system the first line is faster and should be preferred. Theoretically, if memory access does not go through a cache, both take the same time.
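
The two access patterns can be put side by side as below. This is a sketch with hypothetical function names; it sums the elements instead of printing them so that the result is easy to compare. Both functions visit every element exactly once and differ only in the order of the memory accesses.

```cpp
const int N = 100;

// Row-major walk: the inner loop moves through adjacent ints in memory.
long sum_row_major(const int (&a)[N][N])
{
    long total = 0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            total += a[i][j];
    return total;
}

// Column-major walk: every inner step jumps N ints (one whole row) ahead.
long sum_column_major(const int (&a)[N][N])
{
    long total = 0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            total += a[j][i];
    return total;
}
```

Timing the two on a large enough array is the usual way to see the cache effect; how big the difference is depends on the machine.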

4. What is function overloading? How does the compiler know which function to call?

class A {
    void func(int a);
    void func(int a, int b);
};

Answer: Defining several functions with the same name but a different number or types of parameters is called function overloading. The compiler looks at the number and types of the arguments passed in the call and decides which function to call at compile time.
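
A small sketch of the selection at work. The struct name Printer and the extra func(double) overload are hypothetical additions; each overload returns a string naming itself, so the choice made by the compiler is visible.

```cpp
#include <string>

struct Printer {
    std::string func(int)      { return "func(int)"; }
    std::string func(int, int) { return "func(int, int)"; }
    std::string func(double)   { return "func(double)"; }
};
```

Here p.func(1) picks the int overload, p.func(1, 2) the two-parameter one, and p.func(1.5) the double one, all decided at compile time.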

Suppose the function is declared as below with a default argument, and the call does not pass that argument. How would the compiler decide which function to call?

class A {
    void func(int a);
    void func(int a, int b = 0);
};

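Answer: The compiler won't be able to decide. A call with a single int argument, such as func(1), matches both overloads equally well, so it is rejected as ambiguous with a compile-time error. The two declarations themselves are legal; only the ambiguous call fails.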
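
The ambiguity can be seen in a sketch like the following. Widget and call_with_two_args are hypothetical names; the ambiguous call is left commented out so that the rest compiles.

```cpp
struct Widget {
    int func(int)          { return 1; }  // one-parameter overload
    int func(int, int = 0) { return 2; }  // second parameter has a default
};

int call_with_two_args()
{
    Widget w;
    // w.func(5);         // error: ambiguous, both overloads accept one int
    return w.func(5, 7);  // fine: only the two-parameter overload matches
}
```

Removing the one-parameter overload, or dropping the default argument, makes the single-argument call unambiguous again.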

Wednesday, August 4, 2010

First bit set in an Integer

Write C code which returns the position of the first bit set. For example, for the number 104 (1101000) the output will be 4.

Left shift the number until it becomes zero and count the number of shifts. The lowest set bit is the last one to be pushed out, so subtracting the count from the size of an integer (32 bits on a 32-bit machine) gives the answer; the counter starts at -1 so that the position is counted from 1. Here is the code for your reference.

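int fbs(unsigned int num)
{
    int pos = -1;       /* starts at -1 so that the result is 1-based */
    if (num == 0)
        return 0;       /* no bit is set */
    while(num!=0)
    {
        num <<= 1;      /* assumes unsigned int is 32 bits wide */
        pos++;
    }
    return 32 - pos;
}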
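
For comparison, here is a sketch of a different technique that does not depend on the width of an integer: test the lowest bit and shift right until it is set. The function name lowest_set_bit is hypothetical, and returning 0 for an input of 0 is a choice made for this example.

```cpp
// Returns the 1-based position of the lowest set bit, or 0 if num is 0.
int lowest_set_bit(unsigned int num)
{
    if (num == 0)
        return 0;
    int pos = 1;
    while ((num & 1u) == 0) {  // lowest bit still clear: move on
        num >>= 1;
        pos++;
    }
    return pos;
}
```

For 104 (binary 1101000) the loop shifts three times before the lowest bit is set, so the result is 4, matching the example in the question.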

Mod 16 without using arithmetic

Write a function which takes an integer value as an argument and returns its mod 16 value without using these arithmetic operators: %, +, -, /

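int mod16(int a)
{
    return a & 15;
}

Masking with 15 keeps the low 4 bits, which is the mod 16 value for non-negative numbers. (A right shift by 4, a>>4, would give the quotient a/16 instead.)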
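
A quick way to convince yourself is to compare a bit mask against the % operator. The sketch below uses a hypothetical name, mod16_mask, and is meant for non-negative inputs only.

```cpp
// a mod 16 for non-negative a: 16 is 2^4, so the remainder is exactly
// the low four bits of the value.
int mod16_mask(int a)
{
    return a & 0xF;
}
```

For negative inputs the two disagree: -1 % 16 is -1 in C and C++, while -1 & 0xF is 15 on two's complement machines.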

What is the difference between dynamic and static linking? Which is better?

A dynamic-link library (DLL) is an executable file that acts as a shared library of functions. Dynamic linking provides a way for a process to call a function that is not part of its executable code. The executable code for the function is located in a DLL, which contains one or more functions that are compiled, linked, and stored separately from the processes that use them. DLLs also facilitate the sharing of data and resources. Multiple applications can simultaneously access the contents of a single copy of a DLL in memory.

Dynamic linking differs from static linking in that it allows an executable module (either a .dll or .exe file) to include only the information needed at run time to locate the executable code for a DLL function. In static linking, the linker gets all of the referenced functions from the static link library and places them with your code into your executable.

Using dynamic linking instead of static linking offers several advantages. DLLs save memory, reduce swapping, save disk space, make upgrades easier, provide after-market support, provide a mechanism to extend the MFC library classes, support multilanguage programs, and ease the creation of international versions.

Free software licenses can also be a factor: the LGPL has provisions under which a program that links dynamically to an LGPL library generally does not have to release its own source code.