📅 2011-Mar-04 ⬩ ✍️ Ashwin Nanjappa ⬩ 🏷️ architecture, compiler, cuda, macro ⬩ 📚 Archive
The __CUDA_ARCH__ macro is defined only when device code is being compiled. It identifies the CUDA architecture version that the NVCC compiler is compiling the code for. When defined, it holds a 3-digit value of the form xy0, which indicates a particular CUDA architecture. For example, sm_20 is stored as 200, sm_13 as 130 and so on.
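To see the value concretely, a kernel can write the macro to device memory and the host can read it back. This is only a minimal sketch, not something taken from the NVCC manual: the names storeArch and queryCompiledArch are hypothetical, and error handling is kept to the bare minimum.

```cuda
#include <cuda_runtime.h>

// Kernel that stores the __CUDA_ARCH__ value it was compiled with.
// The macro is undefined in the host compilation pass, hence the guard.
__global__ void storeArch(int* out)
{
#ifdef __CUDA_ARCH__
    *out = __CUDA_ARCH__;
#endif
}

// Hypothetical helper: returns the value seen by the kernel that ran,
// or 0 if no device memory could be allocated or the kernel did not run.
int queryCompiledArch()
{
    int* devValue = NULL;
    int hostValue = 0;

    if (cudaMalloc((void**)&devValue, sizeof(int)) != cudaSuccess)
        return 0;

    // Initialize the device value to 0 so a failed launch is detectable
    cudaMemcpy(devValue, &hostValue, sizeof(int), cudaMemcpyHostToDevice);
    storeArch<<<1, 1>>>(devValue);
    cudaMemcpy(&hostValue, devValue, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(devValue);

    return hostValue;
}
```

Assuming the file is compiled with nvcc -arch=sm_20 and run on a compatible GPU, calling queryCompiledArch() from main() would be expected to return 200, in line with the mapping described above.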
This macro is useful for compiling different code for the host and the device:
__host__ __device__
void foobar()
{
#ifdef __CUDA_ARCH__
    // Device-specific code
#else
    // Host-specific code
#endif
}
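As a concrete case of this pattern, here is a small sketch of a function that uses a device intrinsic when compiled for the device and plain math library calls when compiled for the host. The function name clampToUnit is hypothetical and chosen only for illustration; __saturatef is the CUDA device intrinsic that clamps a float to the range [0, 1].

```cuda
#include <math.h>
#include <cuda_runtime.h>

// Hypothetical helper: clamps x to the range [0, 1] on both host and device.
__host__ __device__
float clampToUnit(float x)
{
#ifdef __CUDA_ARCH__
    // Device pass: use the hardware saturation intrinsic
    return __saturatef(x);
#else
    // Host pass: the intrinsic is not available, so use the math library
    return fminf(fmaxf(x, 0.0f), 1.0f);
#endif
}
```

The same source is compiled twice by NVCC, once for the host and once for the device, and each pass sees only one of the two branches. Note that only the function body depends on the macro here; the signature is the same in both passes.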
This macro is also useful for writing device code that is specific to particular architectures:
__global__
void fooKernel()
{
#if __CUDA_ARCH__ >= 200
    // Code that uses advanced features
    // introduced in sm_20 and later
#else
    // Fallback code for older architectures
#endif
}
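One well-known use of this kind of check is the __ldg() intrinsic, which loads through the read-only data cache and is available only on compute capability 3.5 and later. The sketch below is an illustration under that assumption; the names loadReadOnly, scaleKernel and scaleOnDevice are hypothetical and not from the NVCC manual.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: uses __ldg() where the architecture supports it,
// and falls back to an ordinary load on older architectures.
__device__ float loadReadOnly(const float* p)
{
#if __CUDA_ARCH__ >= 350
    return __ldg(p);
#else
    return *p;
#endif
}

// Multiplies each input element by factor
__global__ void scaleKernel(const float* in, float* out, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = factor * loadReadOnly(in + i);
}

// Hypothetical host wrapper: returns true only if the kernel ran
// and the results were copied back successfully.
bool scaleOnDevice(const float* hostIn, float* hostOut, float factor, int n)
{
    float* devIn = NULL;
    float* devOut = NULL;
    size_t bytes = n * sizeof(float);

    if (cudaMalloc((void**)&devIn, bytes) != cudaSuccess)
        return false;
    if (cudaMalloc((void**)&devOut, bytes) != cudaSuccess) {
        cudaFree(devIn);
        return false;
    }

    cudaMemcpy(devIn, hostIn, bytes, cudaMemcpyHostToDevice);

    int threads = 128;
    int blocks = (n + threads - 1) / threads;
    scaleKernel<<<blocks, threads>>>(devIn, devOut, factor, n);
    cudaError_t launchStatus = cudaGetLastError();

    // This copy also waits for the kernel to finish
    cudaError_t copyStatus =
        cudaMemcpy(hostOut, devOut, bytes, cudaMemcpyDeviceToHost);

    cudaFree(devIn);
    cudaFree(devOut);
    return launchStatus == cudaSuccess && copyStatus == cudaSuccess;
}
```

Because __CUDA_ARCH__ is undefined during the host pass, the #if expression evaluates it as 0 there and the fallback branch is selected, so the file still compiles cleanly for the host.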
Reference: Section on Virtual Architecture Identification Macro in the NVCC Manual.
Tried with: CUDA 5.5 and Ubuntu 12.04 LTS