Code Yarns ‍👨‍💻

CUDA: C++ Wrapper for cudaMalloc

📅 2011-Mar-10 ⬩ ✍️ Ashwin Nanjappa ⬩ 🏷️ cpp, cuda ⬩ 📚 Archive

A typical usage of cudaMalloc to allocate memory on the device is:

const int arraySize = 100;
Foo* fooArray       = NULL;

const int arraySpace = arraySize * sizeof( *fooArray );
cudaMalloc( &fooArray, arraySpace );

cudaMalloc is a C-style function, and using it from C++ is awkward for two reasons. First, fooArray cannot be declared and initialized with the allocated memory in a single statement, because cudaMalloc returns the pointer through an output parameter. Second, the caller has to calculate the array size in bytes.

A simple C++ wrapper for cudaMalloc using a template function can handle both of these problems:

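template< typename T >
T* myCudaMalloc( size_t size )
{
    T* loc = NULL;

    // The size in bytes is computed here, so callers pass only the element count
    const size_t space = size * sizeof( T );

    // Return NULL if the device allocation fails
    if ( cudaMalloc( &loc, space ) != cudaSuccess )
        return NULL;

    return loc;
}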

const int arraySize = 100;
Foo* fooArray       = myCudaMalloc< Foo >( arraySize );

I wish the function could deduce the type without requiring it to be explicitly specified, as in myCudaMalloc< Foo >. But that would mean deducing the template argument solely from the function's return type, which C++ does not do for an ordinary function call.
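
One possible workaround, shown here only as a rough sketch, is to return a small proxy object with a templated conversion operator. The compiler can deduce the template argument of a conversion operator from the pointer type being initialized. The names DeferredAlloc, myDeferredMalloc and fakeDeviceMalloc are hypothetical and not part of CUDA. fakeDeviceMalloc is a host-side stand-in built on std::malloc so that the sketch compiles without a GPU. In real code its call would presumably be replaced by cudaMalloc.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for cudaMalloc so this sketch runs on the host.
// It mimics the same shape: the pointer comes back through an output
// parameter and the return value is 0 on success.
inline int fakeDeviceMalloc( void** ptr, std::size_t bytes )
{
    *ptr = std::malloc( bytes );
    return ( *ptr != NULL ) ? 0 : 1;
}

// Proxy that remembers the element count and defers the allocation
// until it is converted to a concrete pointer type.
class DeferredAlloc
{
public:
    explicit DeferredAlloc( std::size_t count ) : count_( count ) {}

    // T is deduced from the type of the pointer being initialized
    template< typename T >
    operator T*() const
    {
        void* loc = NULL;

        if ( fakeDeviceMalloc( &loc, count_ * sizeof( T ) ) != 0 )
            return NULL;

        return static_cast< T* >( loc );
    }

private:
    std::size_t count_;
};

// Hypothetical wrapper: takes only the element count, no template argument
inline DeferredAlloc myDeferredMalloc( std::size_t count )
{
    return DeferredAlloc( count );
}

// Example element type, assumed here only for illustration
struct Foo
{
    float x;
    int   y;
};
```

With this in place, the call site becomes Foo* fooArray = myDeferredMalloc( 100 ); with no explicit type. The trade-off is that the allocation happens only when the proxy is converted, so the result must be assigned to a typed pointer. Converting the same proxy twice would allocate twice.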

Tried with: CUDA 3.2