Code Yarns ‍👨‍💻

CUDA: Template Kernels

📅 2011-Mar-14 ⬩ ✍️ Ashwin Nanjappa ⬩ 🏷️ cuda, template

CUDA kernel functions can be made generic by writing them as template kernel functions. Doing this is almost identical to writing function templates in C++.

Consider this kernel, which simply copies an input integer array to an output integer array:

//////////////////////////////////////////
// FooDevice.cu
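__global__ void fooKernel( const int* inArr, int* outArr, int num )
{
    // Global index of this thread across the whole grid
    const int curThreadIndex = ( blockIdx.x * blockDim.x ) + threadIdx.x;
    // Total number of threads in the grid, used as the loop stride
    const int threadNum      = gridDim.x * blockDim.x;
    // Each thread copies every threadNum-th element, so any num is covered
    for ( int index = curThreadIndex; index < num; index += threadNum )
        outArr[ index ] = inArr[ index ];
    return;
}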
//////////////////////////////////////////

//////////////////////////////////////////
// FooDevice.h
__global__ void fooKernel( const int*, int*, int );
//////////////////////////////////////////

Template kernels let this kernel work with arrays of any type. Just as with C++ function templates, a CUDA template kernel is normally defined in a header file. The compiler generates code for a template only when it is instantiated for a specific type. To do this at compile time, the complete function definition must be visible in the compilation unit that uses the kernel.

Rewriting the above kernel as a template kernel in the header file:

//////////////////////////////////////////
// FooDevice.cu
// Nothing here
//////////////////////////////////////////

//////////////////////////////////////////
// FooDevice.h
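template< typename T >
__global__ void fooKernel( const T* inArr, T* outArr, int num )
{
    // Global index of this thread across the whole grid
    const int curThreadIndex = ( blockIdx.x * blockDim.x ) + threadIdx.x;
    // Total number of threads in the grid, used as the loop stride
    const int threadNum      = gridDim.x * blockDim.x;
    // Each thread copies every threadNum-th element of type T
    for ( int index = curThreadIndex; index < num; index += threadNum )
        outArr[ index ] = inArr[ index ];
    return;
}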
//////////////////////////////////////////
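
To show how such a template kernel might be launched from host code, here is a minimal sketch. The helper `copyOnDevice` is hypothetical and not part of the original post, the launch configuration of 4 blocks with 128 threads is an arbitrary choice, and error checking of the CUDA calls is left out for brevity. It also assumes a toolkit that accepts C++11 (`nullptr`, `std::vector::data`), which is newer than the CUDA 3.2 this post was tried with. The kernel is repeated so that the block stands on its own; in a real project it would come from `FooDevice.h`.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Template kernel from FooDevice.h, repeated here to keep the sketch self-contained
template< typename T >
__global__ void fooKernel( const T* inArr, T* outArr, int num )
{
    const int curThreadIndex = ( blockIdx.x * blockDim.x ) + threadIdx.x;
    const int threadNum      = gridDim.x * blockDim.x;
    for ( int index = curThreadIndex; index < num; index += threadNum )
        outArr[ index ] = inArr[ index ];
}

// Hypothetical host helper (an assumption for illustration): sends a host
// vector to the device, copies it with fooKernel< T > and returns the result.
// Return codes of the CUDA calls are not checked in this sketch.
template< typename T >
std::vector< T > copyOnDevice( const std::vector< T >& input )
{
    const int    num   = static_cast< int >( input.size() );
    const size_t bytes = input.size() * sizeof( T );
    std::vector< T > output( input.size() );

    if ( num == 0 )
        return output;  // Nothing to copy, skip the device round trip

    T* dIn  = nullptr;
    T* dOut = nullptr;
    cudaMalloc( ( void** ) &dIn,  bytes );
    cudaMalloc( ( void** ) &dOut, bytes );
    cudaMemcpy( dIn, input.data(), bytes, cudaMemcpyHostToDevice );

    // The template argument is given explicitly, so the compiler instantiates
    // the kernel for T at this point. The loop inside the kernel strides over
    // the array, so a fixed launch of 4 x 128 threads handles any num.
    fooKernel< T ><<< 4, 128 >>>( dIn, dOut, num );

    // A blocking copy on the default stream waits for the kernel to finish
    cudaMemcpy( output.data(), dOut, bytes, cudaMemcpyDeviceToHost );

    cudaFree( dIn );
    cudaFree( dOut );
    return output;
}
```

Calling `copyOnDevice` with a `std::vector< int >` instantiates `fooKernel< int >`, while calling it with a `std::vector< float >` instantiates `fooKernel< float >`. The calling code has to live in a `.cu` file compiled with `nvcc`, since it contains the `<<< >>>` launch syntax.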

Tried with: CUDA 3.2


© 2022 Ashwin Nanjappa • All writing under CC BY-SA license