Architecture Identification Macro in CUDA

📅 2011-Mar-04 ⬩ ✍️ Ashwin Nanjappa

The __CUDA_ARCH__ macro identifies the CUDA architecture version that the NVCC compiler is compiling device code for. It is defined only while device code is being compiled and is undefined while host code is being compiled. When defined, it holds a 3-digit value of the form xy0, where x and y are the major and minor version numbers of the architecture. For example, it is 200 for sm_20 and 130 for sm_13.
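
The xy0 encoding can be unpacked with plain integer arithmetic. The snippet below is only a sketch to illustrate the numbering: arch_major and arch_minor are hypothetical helper names made up for this post, not part of CUDA. It is ordinary host-side C++11, so it compiles with or without NVCC.

```cpp
// Hypothetical helpers (not part of CUDA) that unpack a value of the
// form xy0, as held by __CUDA_ARCH__, into its major and minor digits.
constexpr int arch_major(int arch) { return arch / 100; }
constexpr int arch_minor(int arch) { return (arch / 10) % 10; }

// The two examples from the text: sm_20 is 200 and sm_13 is 130.
static_assert(arch_major(200) == 2 && arch_minor(200) == 0, "sm_20");
static_assert(arch_major(130) == 1 && arch_minor(130) == 3, "sm_13");
```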

Since the macro is undefined in host code, it can be used to compile different code for the host and the device inside the same function:

__host__ __device__ void foobar()
{
#ifdef __CUDA_ARCH__
    // Device specific code
#else
    // Host specific code
#endif
}
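
As a slightly fuller sketch of the same pattern, the function below returns a value that depends on which compilation pass produced it. The name running_on_device is hypothetical. The __CUDACC__ check is an addition of this sketch: NVCC defines that macro when compiling CUDA source files, and it is used here only to blank out the CUDA qualifiers so that the snippet also builds with a plain C++ compiler.

```cpp
// Without NVCC the CUDA function qualifiers do not exist, so define
// them away. This is an assumption of the sketch, not a CUDA requirement.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// Hypothetical helper: reports which version of the function is running.
__host__ __device__ inline bool running_on_device()
{
#ifdef __CUDA_ARCH__
    return true;   // Compiled during the device pass
#else
    return false;  // Compiled during the host pass (or by a plain C++ compiler)
#endif
}
```

Called from host code, running_on_device() returns false. Called from inside a kernel, it should return true, because the device pass sees __CUDA_ARCH__ defined.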

The value of the macro can also be tested to write device code that is specific to particular architectures:

__global__ void fooKernel()
{
#if __CUDA_ARCH__ >= 200
    // Code that uses advanced features
    // introduced in sm_20 and later
#else
    // Fallback code for older architectures
#endif
}

Reference: Section on Virtual Architecture Identification Macro in the NVCC Manual.

Tried with: CUDA 5.5 and Ubuntu 12.04 LTS