How to speed up recompilation with ccache

C++ compilation of large projects takes forever. One trick to speed it up is to cache previous compilations. When a compilation unit and its compilation options exactly match an earlier one, the result from the cache can be used directly. Such a compilation cache can reduce compilation times enormously (by orders of magnitude) on a machine where you build several times a day. CCache is an implementation of such a compilation cache for C and C++ compilation using GCC compilers.

  • Installing it is easy:
$ sudo apt install ccache
  • The ccache man page suggests replacing the symlinks of gcc, g++, cc and c++ with symlinks to /usr/bin/ccache. This works, but is an onerous method.

  • The method I like is to just add /usr/lib/ccache to the front of your PATH environment variable. This directory has symlinks named for all the GCC compilers and they point to the ccache binary.

  • Once you finished either of the above two methods, that is it! You can just run your builds as usual. The first time your builds will take the usual time, but from the second time you should be able to witness enormous speedups.

  • To check details of the cache, such as its size, how much is occupied, number of cache hits and misses:

$ ccache --show-stats
  • If you feel that reuse of compilation from cache is causing some weird compilation or linking problems, then you can clear out the cache:
$ ccache --clear

CCache is so easy to use and its benefits are so bountiful that I highly recommend using it if you are a C or C++ programmer.

(Thanks to this post for introducing me to ccache and you can also find some speedup metrics of using ccache in it.)

Tried with: CCache 3.1.9 and Ubuntu 14.04

Advertisements

Prime number buckets in hash table implementations

Prime numbers are very important in the implementation of hash tables. They might be used in computing a hash for a given key using a hash function. Also, they seem to be commonly used as the size of the hash table, i.e., the number of buckets in the table.

Consider, a hash table with separate chaining. When there is a collision, i.e., multiple keys map to the same bucket, they are maintained in a list at that bucket. The load factor of the hash table is the average number of keys per bucket in the table. When the load factor increases beyond a certain threshold, a new hash table with larger number of buckets is chosen, the existing keys are rehashed and inserted anew into the new hash table.

A key part of a hash table implementation is: what number of buckets to choose when increasing or decreasing the size of a hash table? In containers like vectors, this is easy: in most implementations it grows by 2x the size. But, for hash tables, I found that implementations seem to prefer prime numbers. So, how do hash table implementations pick this prime number?

Hash table in C++ STL of GCC

The containers based on hash tables in C++ STL are unordered_set, unordered_multiset, unordered_map and unordered_multimap. All of them are based on the same underlying hash table implementation. Computing an optimal prime number for a given load factor and number of keys is not easy.

Not surprisingly, the GCC 5.1 implementation of STL has a pre-computed lookup table of prime numbers. I found it in /usr/include/c++/5/ext/pb_ds/detail/resize_policy/hash_prime_size_policy_imp.hpp. Here is the array of prime numbers it uses:

static const std::size_t g_a_sizes[num_distinct_sizes_64_bit] =
{
  /* 0     */              5ul,
  /* 1     */              11ul, 
  /* 2     */              23ul, 
  /* 3     */              47ul, 
  /* 4     */              97ul, 
  /* 5     */              199ul, 
  /* 6     */              409ul, 
  /* 7     */              823ul, 
  /* 8     */              1741ul, 
  /* 9     */              3469ul, 
  /* 10    */              6949ul, 
  /* 11    */              14033ul, 
  /* 12    */              28411ul, 
  /* 13    */              57557ul, 
  /* 14    */              116731ul, 
  /* 15    */              236897ul,
  /* 16    */              480881ul, 
  /* 17    */              976369ul,
  /* 18    */              1982627ul, 
  /* 19    */              4026031ul,
  /* 20    */              8175383ul, 
  /* 21    */              16601593ul, 
  /* 22    */              33712729ul,
  /* 23    */              68460391ul, 
  /* 24    */              139022417ul, 
  /* 25    */              282312799ul, 
  /* 26    */              573292817ul, 
  /* 27    */              1164186217ul,
  /* 28    */              2364114217ul, 
  /* 29    */              4294967291ul,
  /* 30    */ (std::size_t)8589934583ull,
  /* 31    */ (std::size_t)17179869143ull,
  /* 32    */ (std::size_t)34359738337ull,
  /* 33    */ (std::size_t)68719476731ull,
  /* 34    */ (std::size_t)137438953447ull,
  /* 35    */ (std::size_t)274877906899ull,
  /* 36    */ (std::size_t)549755813881ull,
  /* 37    */ (std::size_t)1099511627689ull,
  /* 38    */ (std::size_t)2199023255531ull,
  /* 39    */ (std::size_t)4398046511093ull,
  /* 40    */ (std::size_t)8796093022151ull,
  /* 41    */ (std::size_t)17592186044399ull,
  /* 42    */ (std::size_t)35184372088777ull,
  /* 43    */ (std::size_t)70368744177643ull,
  /* 44    */ (std::size_t)140737488355213ull,
  /* 45    */ (std::size_t)281474976710597ull,
  /* 46    */ (std::size_t)562949953421231ull, 
  /* 47    */ (std::size_t)1125899906842597ull,
  /* 48    */ (std::size_t)2251799813685119ull, 
  /* 49    */ (std::size_t)4503599627370449ull,
  /* 50    */ (std::size_t)9007199254740881ull, 
  /* 51    */ (std::size_t)18014398509481951ull,
  /* 52    */ (std::size_t)36028797018963913ull, 
  /* 53    */ (std::size_t)72057594037927931ull,
  /* 54    */ (std::size_t)144115188075855859ull,
  /* 55    */ (std::size_t)288230376151711717ull,
  /* 56    */ (std::size_t)576460752303423433ull,
  /* 57    */ (std::size_t)1152921504606846883ull,
  /* 58    */ (std::size_t)2305843009213693951ull,
  /* 59    */ (std::size_t)4611686018427387847ull,
  /* 60    */ (std::size_t)9223372036854775783ull,
  /* 61    */ (std::size_t)18446744073709551557ull,
};

You can create a simple C++ program that inserts keys into an unordered_set and then check the number of buckets using the bucket_count method. You will find that it will be one of the above listed prime numbers.

Hash table in .Net (C#)

Now that the .Net source code is available, I also checked out its System.Collections.HashTable implementation. It too seems to be using a lookup table of prime numbers for the table size. Here is the list from its source code:

// Table of prime numbers to use as hash table sizes. 
// A typical resize algorithm would pick the smallest prime number in this array
// that is larger than twice the previous capacity. 
// Suppose our Hashtable currently has capacity x and enough elements are added 
// such that a resize needs to occur. Resizing first computes 2x then finds the 
// first prime in the table greater than 2x, i.e. if primes are ordered 
// p_1, p_2, ..., p_i, ..., it finds p_n such that p_n-1 < 2x < p_n. 
// Doubling is important for preserving the asymptotic complexity of the 
// hashtable operations such as add.  Having a prime guarantees that double 
// hashing does not lead to infinite loops.  IE, your hash function will be 
// h1(key) + i*h2(key), 0 <= i < size.  h2 and the size must be relatively prime.
public static readonly int[] primes = {
    3, 7, 11, 17, 23, 29, 37, 47, 59, 71, 89, 107, 131, 163, 197, 239, 293, 353, 431, 521, 631, 761, 919,
    1103, 1327, 1597, 1931, 2333, 2801, 3371, 4049, 4861, 5839, 7013, 8419, 10103, 12143, 14591,
    17519, 21023, 25229, 30293, 36353, 43627, 52361, 62851, 75431, 90523, 108631, 130363, 156437,
    187751, 225307, 270371, 324449, 389357, 467237, 560689, 672827, 807403, 968897, 1162687, 1395263,
    1674319, 2009191, 2411033, 2893249, 3471899, 4166287, 4999559, 5999471, 7199369};

The prime numbers in .Net are different from that in GCC STL. I’m guessing they are tuned for the .Net load factors, languages and virtual machine.

A conclusion we can draw from these observations is that though hash tables easily beat balanced binary search tree (BST) in lookup performance, they are not any easier to implement. Especially a hash table that is built for general purpose applications and data sizes.

How to warn on shadow declaration

Shadow declarations are a major source of misunderstanding and bugs in C or C++ code. C and C++ allow shadow declarations, but this is one of those features that is better not used in production code that needs to be understood and maintained by multiple people.

  • Use the -Wshadow compiler option to be warned about shadow declarations.

  • Note that this option is not part of -Wall or -Wextra, so you will need to explicitly add this to your build.

  • You might find that header files you included from external libraries might throw this warning. (I have faced this with the OpenNI library.) In such a case, ignore this warning only for that particular header file inclusion by using diagnostic pragma, as described here .

Tried with: GCC 5.1 and Ubuntu 14.04

How to disable assert in GCC

Using assert in C or C++ code is an excellent method to ensure correctness of pre-conditions and post-conditions in functions. Assertions are enabled by default in code compiled with GCC C and C++ compilers.

To disable assert, NDEBUG must be defined. You can choose to define this in your code as #define NDEBUG. Or you can pass -DNDEBUG to the GCC compiler while compiling the code.

If you use CMake, you can add it to CMakeLists.txt like this:

add_definitions(-DNDEBUG)

Tried with: CMake 2.8.12.2, GCC 4.9.2 and Ubuntu 14.04

How to switch GCC version using update-alternatives

Multiple versions of GCC can be installed and used on Ubuntu as described here. The update-alternatives tool makes it easy to switch between multiple versions of GCC.

On Ubuntu, gcc and g++ are just symbolic links to the actual binaries of a specific version of GCC. By switching the version, invoking gcc will execute the particular version of the compiler binary that you wish. You can make any of these version as the default at any time effortlessly.

As an example, I had installed GCC version 4.8 from the Ubuntu repositories. This was the default version of GCC, so gcc was a symlink to gcc-4.8 binary. Wanting to use some new C++11 features I installed version 4.9 of GCC. This compiler can be invoked using gcc-4.9. I now want to switch the default gcc to invoke gcc-4.9. I also want the freedom to switch back 4.8 as the default whenever I want. You can switch the symlinks yourself manually, but using this tool makes it easy and clean.

Let us begin:

  • Decide which set of symbolic links you want to group together as one unit. I like to switch /usr/bin/gcc and /usr/bin/g++ together.

  • Pass update-alternatives the first version of these symbolic links. Here I will inform about the 4.8 version of these tools and links:

$ sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.8 100 --slave /usr/bin/g++ g++ /usr/bin/g++-4.8

Here, we have provided the gcc as the master and g++ as slave. Multiple slaves can be appended along with master. When master symbolic link is changed, the slaves will be changed too.

  • Pass the second version of these tools to be recorded:
$ sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.9 50 --slave /usr/bin/g++ g++ /usr/bin/g++-4.9 
  • Now you can switch between these versions by using:
$ sudo update-alternatives --config gcc

Tried with: Ubuntu 14.04

How to install PyCUDA

PyCUDA enables a Python program to pass data to and call CUDA kernels from a Python program. Getting it to install and work correctly on Ubuntu took a bit of work.

  • I tried installing the nvidia-343 drivers for Ubuntu. But, it turns out that only 340.x or earlier drivers support the GTS 250 I use for display (not for compute). So, I reverted back and installed nvidia-331 drivers. Note that the nvidia-331-updates driver will not work with CUDA either. Do not ask me why! ๐Ÿ™‚

  • Installing the CUDA toolkit was easier, just install nvidia-cuda-toolkit package.

  • PyCUDA is available in Ubuntu as a python-pycuda package. But, that is the very old 2013.1.1 version. Instead I installed it from the Python Package Index (PyPI):

$ sudo pip install pycuda

The install script kept complaining about the absence of a configure.py script, but it seemed to end with success.

Tried with: PyCUDA 2014.1, CUDA 5.5, NVIDIA 331 driver and Ubuntu 14.04

CUDA error: gcc 4.9 and up are not supported

Problem

I have seen this error while compiling a CUDA program. I have also seen this error on running a PyCUDA program:

#error -- unsupported GNU version! gcc 4.9 and up are not supported!

Solution

I was using GCC 4.9.2 on this system, but I did not know why CUDA had a problem with these newer versions. Turns out that it does not have any problem, this warning just needs to be choked in the relevant header file.

For CUDA, this header file was /usr/local/cuda/include/host_config.h. For PyCUDA, this header file was /usr/include/host_config.h.

This diff should fix this error:

--- host_config.h.bkp   2015-02-24 09:53:55.232620612 +0800
+++ host_config.h   2015-02-24 10:24:01.428654521 +0800
@@ -77,11 +77,11 @@

 #if defined(__GNUC__)

-#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ > 8)
+#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ > 9)

 #error -- unsupported GNU version! gcc 4.9 and up are not supported!

-#endif /* __GNUC__> 4 || (__GNUC__ == 4 && __GNUC_MINOR__ > 8) */
+#endif /* __GNUC__> 4 || (__GNUC__ == 4 && __GNUC_MINOR__ > 9) */

 #endif /* __GNUC__ */

Tried with: PyCUDA 2014.1, CUDA 6.5, GCC 4.9.2 and Ubuntu 14.04

Warning options of GCC

The GCC and G++ compilers have many warning options that can be useful to improve the quality of your code. All the warning options are listed and described here.

Since the list of warnings is very long, GCC provides a few warning options that are a shortcut for enabling a larger bunch of warnings. These warning options are -Wall, -Wextra and -Wpedantic.

The warnings enabled by -Wall and -Wextra is listed along with their description. However, this is not provided for -Wpedantic. I went through the individual warning descriptions and found the warning options enabled by -Wpedantic:

Warning options enabled by -Wpedantic:

-Wpointer-arith
-Wlong-long
-Wvariadic-macros
-Wvla
-Woverlength-strings

It can very useful to know which of the warning options are not enabled by -Wall, -Wextra and -Wpedantic. These are the options that might be useful for debugging or improving particular problems with your code. Again, I went through the individual warning descriptions and here they are:

Warning options not included by -Wall, -Wextra and -Wpedantic:

-Wdouble-promotion
-Wmissing-include-dirs
-Wswitch-default
-Wswitch-enum
-Wswitch-bool
-Wsync-nand
-Wsuggest-attribute=[pure|const|noreturn|format]
-Wsuggest-final-types
-Wsuggest-final-methods
-Wsystem-headers
-Wtrampolines
-Wfloat-equal
-Wtraditional
-Wtraditional-conversion
-Wdeclaration-after-statement
-Wundef
-Wshadow
-Wlarger-than=len
-Wframe-larger-than=len
-Wstack-usage=len
-Wunsafe-loop-optimizations
-Wbad-function-cast
-Wc90-c99-compat
-Wc99-c11-compat
-Wc++-compat
-Wcast-qual
-Wcast-align
-Wwrite-strings
-Wconditionally-supported
-Wconversion
-Wzero-as-null-pointer-constant
-Wdate-time
-Wdelete-incomplete
-Wuseless-cast
-Wjump-misses-init
-Wsign-conversion
-Wfloat-conversion
-Wlogical-op
-Waggregate-return
-Wstrict-prototypes
-Wold-style-definition
-Wmissing-prototypes
-Wmissing-declarations
-Wnormalized[=<none|id|nfc|nfkc>]
-Wpacked
-Wpadded
-Wredundant-decls
-Wnested-externs
-Winline
-Winvalid-pch
-Wvector-operation-performance
-Wdisabled-optimization
-Wstack-protector
-Wunsuffixed-float-constants

Note that I have not included the warning options that disable certain types of warning messages.

Finally, here is the list of warnings that I like to compile with:

-Wall
-Wextra
-Wpedantic
-Wconversion
-Wmissing-include-dirs

Tried with: GCC 5.0.0 online documentation

Unused parameter warning of GCC

GCC throws a unused parameter warning if you are compiling your code with one of these options: -Wunused-parameter, -Wunused or -Wextra.

The warning message looks like this:

foo.cpp:43:40: warning: unused parameter โ€˜offโ€™ [-Wunused-parameter]
 string IntToString(int num, int off)
                                 ^

In most scenarios, this warning is good since you find out that maybe this parameter is not being used currently, so it can be removed from the definition, declaration and the calls.

However, sometimes the function definition cannot be changed. For example, if the function is a callback that is required by a library. In such a case, the function signature cannot be changed even if you are not using that parameter inside your callback.

There is a simple solution for this scenario: remove the parameter name, just keep the parameter type. For example, to fix the function shown above, you could change its definition to:

string IntToString(int num, int)
{
    // Code of function here
}

This keeps the function signature, so it still works with the library. But since the parameter name is gone, you no longer get the warning. ๐Ÿ™‚

Tried with: GCC 4.9 and Ubuntu 14.04

How to make GCC stop after first error

During compilation of a source file, GCC tries to show as many error messages as possible until it cannot make any sense of the input. This is the default behavior. For certain files, the error messages can be huge wall of text that should be navigated to find the first or first few errors. So, sometimes I would like GCC to just show me the first error and stop there.

To make GCC stop after displaying the first error:

$ gcc -Wfatal-errors foo.c

$ g++ -Wfatal-errors foo.cpp

If you use CMake to compile your code, then this directive can be added to CMakeLists.txt:

add_definitions(
    -Wfatal-errors
    )

Tried with: GCC 4.9 and Ubuntu 14.04