An examination of C++ virtual functions

Virtual functions are a key feature of C++ that enables runtime polymorphism. This post is my attempt at understanding how they are implemented and executed at runtime. The compiler used is GCC 5.4.0 on Ubuntu 16.04.

Here is a simple program that uses virtual functions that we will use as an example:

To aid us in understanding what this code is compiled into, we request GCC to add debugging information (using option -g) when we compile it:

$ g++ -g virtual_function_example.cpp
$ ./a.out
In B

Almost all C++ compilers implement virtual functions using virtual tables, more commonly called vtables. A vtable is a table of function addresses, one for each virtual function in the class. One virtual table is created for each class that has virtual functions.

We can see the existence of the methods and virtual tables of each class and their addresses by examining the binary:

$ readelf --symbols a.out | c++filt | grep -E "vtable|A::|B::"

    86: 0000000000400936    11 FUNC    WEAK   DEFAULT   14 A::do_something()
    81: 0000000000400942    30 FUNC    WEAK   DEFAULT   14 A::do_something2()
    87: 0000000000400960    11 FUNC    WEAK   DEFAULT   14 B::do_something()
    84: 000000000040096c    30 FUNC    WEAK   DEFAULT   14 B::do_something2()
    60: 000000000040098a    23 FUNC    WEAK   DEFAULT   14 A::A()
    69: 00000000004009a2    39 FUNC    WEAK   DEFAULT   14 B::B()
    92: 0000000000400a68    32 OBJECT  WEAK   DEFAULT   16 vtable for B
    63: 0000000000400a88    32 OBJECT  WEAK   DEFAULT   16 vtable for A

Here we use the readelf program to extract the symbols from the binary. The symbols are in mangled form, which is difficult for humans to decipher, so we pipe them through the c++filt demangler and then filter for the lines we care about with grep.

We can check which sections of virtual memory the class methods and virtual tables will be loaded into by examining the sections of the binary:

$ readelf --sections a.out
There are 37 section headers, starting at offset 0x6b78:

Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [..]
  [14] .text             PROGBITS        00000000004007a0 0007a0 0002a2 00  AX  0   0 16
  [..]
  [16] .rodata           PROGBITS        0000000000400a50 000a50 00008b 00   A  0   0  8
  [..]

Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), l (large)
  I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)

We can cross-check the addresses of the class methods and virtual tables against the starting addresses and sizes of the sections. The class methods (0x400936 to 0x4009a2) fall inside the .text section, which spans 0x4007a0 to 0x400a42, and the virtual tables (0x400a68 and 0x400a88) fall inside the .rodata section, which spans 0x400a50 to 0x400adb. The flags of these sections indicate that only the .text section is executable, as it should be.

Finally, let us examine how the virtual tables are used at runtime to determine which method to execute. To do this, we disassemble the binary:

$ objdump --disassemble --demangle --source a.out

int main()
{
  400896:       55                      push   %rbp
  400897:       48 89 e5                mov    %rsp,%rbp
  40089a:       53                      push   %rbx
  40089b:       48 83 ec 18             sub    $0x18,%rsp
    A* a = new B();
  40089f:       bf 08 00 00 00          mov    $0x8,%edi
  4008a4:       e8 d7 fe ff ff          callq  400780 <operator new(unsigned long)@plt>
  4008a9:       48 89 c3                mov    %rax,%rbx
  4008ac:       48 c7 03 00 00 00 00    movq   $0x0,(%rbx)
  4008b3:       48 89 df                mov    %rbx,%rdi
  4008b6:       e8 e7 00 00 00          callq  4009a2 <B::B()>
  4008bb:       48 89 5d e8             mov    %rbx,-0x18(%rbp)
    a->do_something2();
  4008bf:       48 8b 45 e8             mov    -0x18(%rbp),%rax
  4008c3:       48 8b 00                mov    (%rax),%rax
  4008c6:       48 83 c0 08             add    $0x8,%rax
  4008ca:       48 8b 00                mov    (%rax),%rax
  4008cd:       48 8b 55 e8             mov    -0x18(%rbp),%rdx
  4008d1:       48 89 d7                mov    %rdx,%rdi
  4008d4:       ff d0                   callq  *%rax

  4008d6:       b8 00 00 00 00          mov    $0x0,%eax
    return 0;
  4008db:       48 83 c4 18             add    $0x18,%rsp
  4008df:       5b                      pop    %rbx
  4008e0:       5d                      pop    %rbp
  4008e1:       c3                      retq

From the output of objdump, only the disassembly of the main function is shown above. In the above command, we have requested objdump to --disassemble the binary code to assembly code, to --demangle the symbol names to human readable form and to annotate the disassembly with the original C++ --source statements.

Examining the disassembled code reveals the runtime mystery. Note that every object of a class that has virtual methods stores a pointer to its class's virtual table. On a 64-bit computer, this means that objects of such classes need 8 bytes of extra space; the 0x8 passed to operator new in the disassembly above is exactly this pointer. With GCC, the pointer is placed at the beginning of the memory layout of the object, before the other members of the object.

When you call a virtual method in C++ code, the compiler generates instructions that perform these steps:

  • Go to the beginning of the object. This is a location on the heap or stack, depending on how the object was created. This is where a pointer to its class virtual table is stored, and the pointer is loaded from there (mov (%rax),%rax in the disassembly above).
  • Follow that pointer to the class virtual table. This is a location in the .rodata section of the process virtual memory, as we noted earlier.
  • Depending on which virtual method is needed, move to that entry in the virtual table (add $0x8,%rax selects the second entry, the one for do_something2) and load the address of the virtual method stored there.
  • Finally, call that address indirectly (callq *%rax), passing the object pointer in %rdi as the this argument. The method's instructions are in the .text section of the process virtual memory.

Reference to pointer in C++

Reference to pointer is a useful construct to be aware of in C++. As you might already know, the C++ language allows you to take a reference to any object that is not a temporary. You can take a const reference to any object, including a temporary. So what is special about reference to a pointer?

A common construct in C++ is a function that receives a pointer to a pointer as an input argument. This is typically used to allocate a primitive or an object inside the function, so that the allocated primitive or object is available to the caller after the function returns. This idiom is common among C programmers who have moved to C++. Such code becomes a lot cleaner and easier to write and read if a reference to pointer is used as the input argument instead. As a bonus, since it is a reference, it always has to refer to a pointer that actually exists.

The code example below shows the difference between pointer to pointer and reference to pointer:

Note how reference-to-pointer is cleaner to understand both at the call site and inside the callee.

Always make base class destructor as virtual in C++

TLDR: The title of this post says it all!

If you have a class hierarchy, with a base class and derived classes, then try to always make the base class destructor virtual.

I recently noticed an application having a serious memory leak after merging some code. Other than the leak, everything else about the code was executing fine! After debugging the code, the culprit turned out to be a base class destructor that was not virtual. If only the above rule had been followed diligently, the error would have been caught easily.

Why this rule? The reason is pretty simple. A derived class destructor might be deallocating objects or freeing memory that it had allocated earlier during its creation or execution. Now think about the scenario where this derived class object is held using a base class pointer and is deleted through that pointer.

  • If base class destructor is not virtual: Formally this is undefined behavior. In practice, only the base class destructor is called, so whatever the derived class destructor was supposed to free is leaked.

  • If base class destructor is virtual: The derived class destructor is called first (thus freeing its allocated objects correctly) before the trail of destruction heads up the chain of hierarchy, ending in the base class destructor. This is the intended correct behavior.

Here is a code example that illustrates this scenario:

override and final in C++

override and final are two new qualifiers introduced in C++11. The use of these qualifiers is optional. They are meant to be used with virtual methods to show the intention of the method. The compiler uses these qualifiers to check whether your intention matches what the code actually does and reports a compile error if it does not. Thus, they help to catch bugs earlier, at compile time.

  • When you specify override for a method, you are indicating to the compiler that this is a virtual method and it is overriding a virtual method with the same signature in one of the base classes that the current class inherits from. If the method is not overriding any virtual method with the same signature, the compiler reports an error. Without this qualifier, a mistake in the function signature while defining this method would silently create a new method instead of an override.

  • When you specify final for a method, you are indicating that this is a virtual method and that no class that inherits from the current class can override this method. If any method tries to override this method in an inherited class, the compiler throws an error.

  • If override or final are used with non-virtual methods, the compiler throws an error.

  • These qualifiers are specified after the function input arguments and should be specified after const if the virtual method is a const method. If you put these qualifiers before const, you will get a weird error with GCC that gives no hint that the order of the qualifiers is wrong!

  • These qualifiers are to be specified only with the method declaration inside the class. If you try to use them with an out-of-class method definition, the compiler will report an error.

  • You can specify override final for a method. For a method that really does override a base class method, the effect is the same as using final alone. Keeping override is still useful when the method is also declared virtual, because virtual with final alone would accept a mistaken signature as a brand new virtual method.

  • override is not allowed to be used with the base virtual method. This is for the obvious reason that the base virtual method is the first virtual method and it is not overriding any other method.

  • final can be used with the base virtual method. This can be used to specify that the first base virtual method cannot be overridden in any inherited class.

This code example shows how to use these qualifiers:

Visual Studio Code extensions that I use

  • CPP Tools: The official extension for working with C++ code. Automatically indexes all code in the currently open directory, offers auto-completion and syntax highlighting.

  • Python by Don Jayamanne: There are many Python extensions, but this seems to be the most popular one. Syntax highlighting, indexing and code completion.

  • Vim: There are many Vim extensions, but this seems to be the most popular one. It has entire universes to traverse before it can be as good as Vrapper, the Vim extension for Eclipse. This VSCode extension offers very basic navigation and editing commands.

  • Git Blame: This extension does one little thing that I need every day to work with code from other people: know who modified a line of code. This extension shows that for the current line in the status bar.

  • Matlab: I need to regularly browse through some MATLAB files. This extension offers syntax highlighting of Matlab files.

Tried with: Visual Studio Code 1.4 and Ubuntu 16.04

Notes of talk: C++ in the 21st century

I recently came across a 2014 talk by Arvid Norberg about the new features in C++11. The video is here and slides are here.

C++ is huge and getting bigger every day. So, I keep discovering interesting new features that I like to note down for use in my own code. Below are my notes from this talk. I do not note aspects that I already know well. This talk has examples that are small but illustrative, so if you hit any of these features, you should see the video to look at the examples.

For loops

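  • std::begin and std::end work on C arrays too. Note that this is only when the array size is known. So the array must still have its array type where they are called, typically in the scope where it was created. Once it has decayed to a pointer (for example, after being passed to a function as a pointer argument), they no longer work.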

decltype

  • decltype deduces the type of an expression. So its use is in type expressions. For example, as template arguments.
// vector of the return type of function f
std::vector<decltype(f())> vals;
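  • auto deduces the type of its initializer in a similar way, but it follows the template argument deduction rules, so it drops references and top-level const that decltype would preserve.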

lambda functions

  • Lambda expression yields an unnamed function object. The tiny examples in the talk are good.

override

  • This is to help programmers find errors. For example, a virtual method may be non-const in the base class but const in the derived class, so the derived method silently fails to override it, and the programmer might miss this error. If a method in the derived class is declared override but does not actually override anything, the compiler will complain.

unique_ptr

  • This smart pointer is not copyable, but it is movable. The object it owns is deleted when the pointer goes out of scope.
  • Many functions create a heap-allocated object and return it. Traditionally, programmers had to worry about the ownership and lifetime of such a returned object. Return it as unique_ptr and forget about these worries.
  • Also great for storing such heap-allocated objects in containers.

error_code

This C++11 feature was something new to me! I did not understand how to apply it either. I might need to study this in future.

  • error_code represents an error. It holds an integral value indicating what the error is, and a category indicating the domain that value belongs to.
  • The category (std::error_category) is an abstract base class that implements the conversion of the error value to a human-readable message.

chrono

  • There are a whole bunch of old C, C++, Unix and POSIX time functions. They are not platform agnostic, have low time resolutions, have no type safety (milliseconds value can be passed to a function that takes in microseconds and so on) and are not monotonic. Monotonic in this context means that if you measure a time before DST is turned on and after it, the latter value should always be larger, though the wall clock may have been turned back by DST.
  • Chrono introduces a clock with its own epoch (start of life) and its own resolution.
  • time_point: A point in time relative to epoch. It has its resolution encoded inside it.
  • duration: Difference of two time points. It has its resolution encoded inside it.
  • Because these types have their resolutions embedded inside, two durations of different resolutions can be added together to produce a duration whose resolution is at least as fine as both. They can also be passed to a function that accepts a different resolution: the template machinery converts them correctly when the conversion is lossless, and a lossy conversion has to be requested explicitly with duration_cast.

EXIT_SUCCESS and EXIT_FAILURE in C and C++

Even after many years of working with C and C++, I continue to make new discoveries! All these years I had returned 0 from the main function on success and a non-zero value (almost always 1) on failure. Somewhere at the back of my head it had always troubled me that there was no standardization on these return values and that I was returning what were essentially magic numbers.

I recently discovered that the C and C++ standard libraries provide EXIT_SUCCESS and EXIT_FAILURE for exactly this purpose. Returning 0 or EXIT_SUCCESS from the main function reports success, and returning EXIT_FAILURE reports failure. These macros are defined in the stdlib.h and cstdlib headers for C and C++ respectively.

Curious to see what values they represented, I looked up /usr/include/c++/5/cstdlib and found this:

#define EXIT_SUCCESS 0
#define EXIT_FAILURE 1

Tried with: GCC 5.2.1 and Ubuntu 15.10

How to speed up recompilation with ccache

C++ compilation of large projects takes forever. One trick to speed it up is to cache previous compilations. When a compilation unit and its compilation options exactly match an earlier one, the result from the cache can be used directly. Such a compilation cache can reduce compilation times enormously (by orders of magnitude) on a machine where you build several times a day. CCache is an implementation of such a compilation cache for C and C++ compilation using GCC compilers.

  • Installing it is easy:
$ sudo apt install ccache
  • The ccache man page suggests replacing the symlinks of gcc, g++, cc and c++ with symlinks to /usr/bin/ccache. This works, but is an onerous method.

  • The method I like is to just add /usr/lib/ccache to the front of your PATH environment variable. This directory has symlinks named for all the GCC compilers and they point to the ccache binary.

  • Once you have finished either of the above two methods, that is it! You can just run your builds as usual. The first build will take the usual time, but from the second build onwards you should be able to witness enormous speedups.

  • To check details of the cache, such as its size, how much is occupied, number of cache hits and misses:

$ ccache --show-stats
  • If you feel that reuse of compilation from cache is causing some weird compilation or linking problems, then you can clear out the cache:
$ ccache --clear

CCache is so easy to use and its benefits are so bountiful that I highly recommend using it if you are a C or C++ programmer.

(Thanks to this post for introducing me to ccache and you can also find some speedup metrics of using ccache in it.)

Tried with: CCache 3.1.9 and Ubuntu 14.04

Command-line options for Google Test program

Google Test can be used to write unit tests for C++ programs. Some of the useful command-line options and flags available to such a program are:

  • --gtest_list_tests: Lists the names of all the tests available.

  • --gtest_filter=pattern: Run only the tests whose names match the pattern. The pattern is a wildcard pattern (* and ? are supported), not a full regular expression. For example: --gtest_filter="ConvolutionLayerTest.*"

  • --gtest_shuffle: Run unit tests in random order.

Invalid MEX-file error in Matlab

Problem

Running a Matlab script threw this error:

Invalid MEX-file '/home/joe/do_work.mexa64':/usr/local/matlabR2014b/bin/glnxa64/../../sys/os/glnxa64/libstdc++.so.6: version 'GLIBCXX_3.4.19' not found (required by /home/joe/do_work.mexa64).

This error is surprising since do_work.mexa64 was built with Matlab on the same system and with the same environment settings.

Solution

There is a difference in the libstdc++ version between the build and runtime environments. When do_work.mexa64 was built, the libstdc++.so.6 from /usr/lib/x86_64-linux-gnu/ was used since it was first in LD_LIBRARY_PATH. However, when the script is run by Matlab, it picks up its own libstdc++.so.6 file, as explained here!

A simple solution to this problem is to remove or rename the Matlab version of libstdc++.so.6, so that it is not picked up.

Tried with: Ubuntu 14.04