Code Yarns ‍👨‍💻

Architecture Identification Macro in CUDA

📅 2011-Mar-04 ⬩ ✍️ Ashwin Nanjappa ⬩ 🏷️ architecture, compiler, cuda, macro

The __CUDA_ARCH__ macro identifies the CUDA architecture version that the NVCC compiler is compiling device code for. It is defined only while device code is being compiled and is undefined while host code is being compiled. When defined, it holds a 3-digit value of the form xy0, where x and y are the major and minor digits of the architecture. For example, sm_20 is stored as 200, sm_13 as 130, and so on.
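The xy0 encoding can be checked with a little integer arithmetic. The following is only a minimal sketch: archMajor and archMinor are hypothetical helper names, not part of CUDA, and they assume the value follows the xy0 form described above.

```cpp
// Hypothetical helpers (not part of CUDA): split an xy0 value into x and y.
constexpr int archMajor(int arch) { return arch / 100; }
constexpr int archMinor(int arch) { return (arch % 100) / 10; }

// Worked examples from the text: sm_20 is stored as 200, sm_13 as 130.
static_assert(archMajor(200) == 2 && archMinor(200) == 0, "sm_20 -> 200");
static_assert(archMajor(130) == 1 && archMinor(130) == 3, "sm_13 -> 130");
```

The checks run at compile time, so the snippet fails to build if the arithmetic is wrong.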

This macro is useful for compiling different code for the host and the device within the same function:

__host__ __device__ void foobar()
{
#ifdef __CUDA_ARCH__
    // Device specific code
#else
    // Host specific code
#endif
}

This macro is also useful for writing device code that is specific to particular architectures:

__global__ void fooKernel()
{
#if __CUDA_ARCH__ >= 200
    // Code that uses advanced features
    // introduced in sm_20 and later
#else
    // Fallback code for older architectures
#endif
}
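Both uses can be combined into one small experiment that reports which branch the compiler actually kept. This is a sketch under a few assumptions rather than a definitive implementation: HOST_DEVICE, codePath and codePathOnDevice are hypothetical names introduced here, and the __CUDACC__ guard (a macro that NVCC defines when compiling CUDA source files) is only there so that the host-only part also builds with a plain C++ compiler.

```cpp
// NVCC defines __CUDACC__ for CUDA sources. Guarding on it lets the
// host-only part of this sketch build with a plain C++ compiler too.
#ifdef __CUDACC__
#include <cuda_runtime.h>
#define HOST_DEVICE __host__ __device__   // hypothetical convenience macro
#else
#define HOST_DEVICE
#endif

// Hypothetical helper: reports which branch the compiler kept.
//   0 = host pass, 1 = device older than sm_20, 2 = device sm_20 or later
HOST_DEVICE inline int codePath()
{
#if !defined(__CUDA_ARCH__)
    return 0;   // Host specific code
#elif __CUDA_ARCH__ >= 200
    return 2;   // Code that uses features introduced in sm_20 and later
#else
    return 1;   // Fallback code for older architectures
#endif
}

#ifdef __CUDACC__
// Each device pass compiles its own copy of codePath(), so the kernel
// records the branch selected for the architecture it was built for.
__global__ void fooKernel(int *path)
{
    *path = codePath();
}

// Launches fooKernel on one thread; returns -1 if any CUDA call fails.
inline int codePathOnDevice()
{
    int *devPath = nullptr;
    int hostPath = -1;
    if (cudaMalloc(&devPath, sizeof(int)) != cudaSuccess) return -1;
    fooKernel<<<1, 1>>>(devPath);
    if (cudaMemcpy(&hostPath, devPath, sizeof(int),
                   cudaMemcpyDeviceToHost) != cudaSuccess) hostPath = -1;
    cudaFree(devPath);
    return hostPath;
}
#endif
```

Called from host code, codePath() returns 0 because __CUDA_ARCH__ is undefined in the host pass. When the file is built with NVCC, codePathOnDevice() returns the value chosen in the device pass instead.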

Reference: Section on Virtual Architecture Identification Macro in the NVCC Manual.

Tried with: CUDA 5.5 and Ubuntu 12.04 LTS

© 2022 Ashwin Nanjappa • All writing under CC BY-SA license